## Shengbin Wu and Yibai Wang

Table 1.

| | HC | SMC | EMCI | LMCI | AD |
|---|---|---|---|---|---|
| Number of subjects | 210 | 82 | 272 | 187 | 160 |
| Sex | | | | | |
| &nbsp;&nbsp;Male | 109 | 33 | 153 | 108 | 95 |
| &nbsp;&nbsp;Female | 101 | 49 | 119 | 79 | 65 |
| Age (yr) | 76.13±6.54 | 72.45±5.67 | 71.51±7.11 | 73.86±8.44 | 75.18±7.88 |
| Education (yr) | 16.44±2.62 | 16.78±2.67 | 16.07±2.62 | 16.38±2.81 | 15.86±2.75 |

In terms of experimental settings, we adopt a 10-fold cross-validation strategy: the dataset is divided equally into 10 subsets, and in each round 9 subsets are used as the training set while the remaining one serves as the test set; the average over the 10 rounds is reported as the final result. Regarding evaluation metrics, we use the standard accuracy (ACC), sensitivity (SEN), specificity (SPE), and the area under the receiver operating characteristic curve (AUC). For the multi-task classification setting, we first classify each task separately; federated learning is then adopted to integrate all classification tasks, and the integrated results are compared with the individual classification results. In addition, for the alignment of the distributed data, we show how the data distribution changes during the experiment. Finally, the convergence of the proposed general framework algorithm is analyzed.
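The evaluation protocol above can be sketched as follows. This is a minimal illustration of the fold splitting and metric computation only; the function name and the assumption that a model trained on the other nine folds has already produced a decision score for every sample are ours, not from the paper.

```python
import numpy as np

def kfold_metrics(y_true, y_score, k=10, threshold=0.5, seed=0):
    """Average ACC, SEN, SPE, and AUC over k test folds.

    y_score holds the decision values a model (trained on the
    remaining k-1 folds) produced for each sample; labels are 0/1.
    Model training itself is elided.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y_true)), k)
    accs, sens, spes, aucs = [], [], [], []
    for test in folds:
        yt, ys = y_true[test], y_score[test]
        yp = (ys >= threshold).astype(int)          # hard decisions
        tp = int(np.sum((yp == 1) & (yt == 1)))
        tn = int(np.sum((yp == 0) & (yt == 0)))
        fp = int(np.sum((yp == 1) & (yt == 0)))
        fn = int(np.sum((yp == 0) & (yt == 1)))
        accs.append((tp + tn) / len(yt))
        if tp + fn:
            sens.append(tp / (tp + fn))             # true positive rate
        if tn + fp:
            spes.append(tn / (tn + fp))             # true negative rate
        pos, neg = ys[yt == 1], ys[yt == 0]
        if len(pos) and len(neg):
            # AUC via the Mann-Whitney rank formulation
            aucs.append(float(np.mean([(p > n) + 0.5 * (p == n)
                                       for p in pos for n in neg])))
    return tuple(float(np.mean(m)) for m in (accs, sens, spes, aucs))
```

A perfectly separating scorer yields 1.0 for all four metrics, which is a convenient sanity check on the implementation.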

In this paper, we compare the proposed model with traditional multi-task classification models, distributed multi-task classification models, and federated distributed multi-task classification models. Zhang and Shen [21] proposed an SVM-based multi-task classification model for ADNI in 2012, a substantial improvement over independent classification models. The distributed multi-task learning (DMTL) method proposed by Wang et al. [12] uses a simple linear classifier for each task and integrates the tasks through group lasso constraints. Two subspace pursuit methods have since been introduced into DMTL: the distributed greedy subspace pursuit (DGSP) [28] and the dual pursuit for subspace learning (DPSL) [29]. Both simplify the computation, speed up convergence, and prevent the model from prematurely falling into a poor local optimum. Smith et al. [30] introduced federated learning into distributed multi-task learning (FDMTL); their work mainly addresses unbalanced data distributions and the system challenges of federated learning. They also proposed federated multi-task learning (FMTL) [30] to handle the statistical challenges and devised a novel system-aware optimization method.
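For reference, the group-lasso coupling used in DMTL-style models can be sketched as follows. The notation here is illustrative, not the exact objective of [12]: $W = [w_1, \dots, w_T]$ stacks the per-task linear classifiers column-wise, $L_t$ is the loss of task $t$, and $\lambda$ controls the strength of the sparsity constraint.

```latex
\min_{W} \;\; \sum_{t=1}^{T} L_t(w_t)
  \;+\; \lambda \sum_{j=1}^{d} \bigl\| W^{j} \bigr\|_{2},
\qquad W^{j} \text{ denoting the } j\text{-th row of } W .
```

The row-wise $l_2$ penalty (the $l_{2,1}$ norm of $W$) drives entire feature rows to zero jointly across tasks, which is how the group constraint ties the distributed tasks together.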

Figs. 2 and 3 show the classification results of popular comparison methods and our proposed model. Evidently, on a relatively small dataset, our proposed model achieves the best performance. Table 2 reports the details of the UCI datasets, and Table 3 presents the classification results of state-of-the-art methods and our model on the UCI datasets. Benefiting from the adaptive group-sparsity adjustment constraint, we can accurately determine the degree of correlation between different data sources. Meanwhile, as Figs. 2 and 3 show, the multi-task federated learning model exhibits better classification performance and better generalization ability than the single-task model. In the experimental stage, our model also takes less time to converge: for the approximate functions, the proposed general algorithmic framework avoids non-convex calculations and helps the model reach better local optimal solutions. Comparison models based on deep structures such as neural networks cannot demonstrate their effectiveness when experimental data are precious and scarce, and it is difficult for them to obtain good convergence results in a short time.

Table 2. Details of the UCI datasets.

| Datasets | Dimension | Number (per class) |
|---|---|---|
| Heart | 22 | 157/110 |
| Parkinsons | 22 | 147/48 |
| Haberman | 3 | 225/81 |
| Diabetic | 19 | 540/611 |
| Breast | 9 | 458/241 |

Table 3.

| Datasets | Metric | SVM | DMTL | DGSP | DPSL | FDMTL | FMTL | FMLADA |
|---|---|---|---|---|---|---|---|---|
| Heart | ACC | 85.42±0.029 | 85.97±0.017 | 85.40±0.034 | 84.43±0.061 | 85.23±0.019 | 87.29±0.011 | 88.37±0.012 |
| | AUC | 83.13±0.031 | 86.22±0.014 | 84.11±0.053 | 81.89±0.009 | 81.26±0.054 | 76.47±0.019 | 87.43±0.055 |
| Parkinsons | ACC | 81.82±0.033 | 82.35±0.031 | 83.96±0.005 | 86.63±0.004 | 87.70±0.061 | 64.17±0.021 | 89.84±0.023 |
| | AUC | 81.09±0.049 | 80.43±0.012 | 79.84±0.009 | 82.36±0.017 | 83.02±0.032 | 50.23±0.028 | 85.54±0.046 |
| Haberman | ACC | 75.08±0.021 | 76.04±0.023 | 76.07±0.025 | 76.33±0.025 | 76.20±0.017 | 76.33±0.009 | 77.58±0.004 |
| | AUC | 67.10±0.054 | 73.18±0.035 | 74.32±0.063 | 74.22±0.013 | 71.11±0.031 | 50.85±0.019 | 74.47±0.036 |
| Diabetic | ACC | 91.41±0.033 | 91.20±0.059 | 91.48±0.064 | 91.25±0.029 | 91.41±0.019 | 90.33±0.015 | 91.53±0.055 |
| | AUC | 86.79±0.016 | 89.48±0.023 | 83.39±0.055 | 89.75±0.036 | 88.62±0.026 | 65.68±0.048 | 90.41±0.017 |
| Breast | ACC | 73.07±0.062 | 73.55±0.019 | 74.20±0.028 | 73.51±0.066 | 73.66±0.057 | 72.79±0.015 | 74.42±0.022 |
| | AUC | 63.54±0.011 | 67.32±0.015 | 65.91±0.023 | 68.33±0.001 | 68.48±0.068 | 61.47±0.039 | 68.52±0.003 |

In this paper, we propose a general FMTL model framework. The $$l_{2, p}$$ norm is used to adaptively constrain the distributed data so that the sparsity of the multi-task model can be controlled during the federated learning process. To handle the non-convexity of the constraint norm, we adopt an approximation method that transforms it into a convex function, and we prove the rationality and superiority of this method. While ensuring convergence, our method minimizes the error between the original function and its approximation, and the experimental results verify its effectiveness. Compared with other popular models, our model achieves the best results, and the algorithm obtains an improved local solution within a limited number of iterations.
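To make the constraint concrete, the $$l_{2, p}$$ norm of a weight matrix takes the $$l_2$$ norm of each row (one row per feature, shared across tasks) and then the p-th power sum of those row norms; p = 1 recovers the convex group lasso, while p < 1 enforces stronger row sparsity but is non-convex, which is why the convex approximation is needed. A minimal sketch, where the matrix values and the choice of p are illustrative rather than taken from the paper:

```python
import numpy as np

def l2p_norm(W, p):
    """l_{2,p} (quasi-)norm of W: the l_2 norm of each row, then the
    l_p norm of the resulting vector of row norms."""
    row_norms = np.linalg.norm(W, axis=1)        # per-row l_2 norms
    return float(np.sum(row_norms ** p) ** (1.0 / p))

# Rows whose weights are zero across all tasks contribute nothing,
# so penalizing this norm prunes whole features jointly.
W = np.array([[3.0, 4.0],    # active feature, row norm 5
              [0.0, 0.0],    # pruned feature, row norm 0
              [0.0, 5.0]])   # active feature, row norm 5
```

With p = 1 this is the familiar $$l_{2,1}$$ group-lasso penalty; shrinking p toward 0 approaches a count of the nonzero rows, so smaller p yields harder row sparsity at the cost of convexity.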

In future work, we will explore massive multi-task datasets. As datasets grow, the current model may become slow and inefficient at mining the correlations between data sources. We will adopt a simple and effective feature extraction model to summarize the data, and federated learning will guide the iterative optimization at each step of the model.

1. J. Konecny, H. B. McMahan, F. X. Yu, A. T. Suresh, D. Bacon, and P. Richtarik, "Federated learning: strategies for improving communication efficiency," 2017 (Online). Available: https://arxiv.org/abs/1610.05492
2. H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Y. Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, 2017, pp. 1273-1282.
3. Q. Yang, Y. Liu, T. Chen, and Y. Tong, "Federated machine learning: concept and applications," ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, 2019. doi: 10.1145/3298981
4. H. B. McMahan, E. Moore, D. Ramage, and B. A. Y. Arcas, "Communication-efficient learning of deep networks from decentralized data," 2016 (Online). Available: https://arxiv.org/abs/1602.05629v1
5. Y. Xue, X. Liao, L. Carin, and B. Krishnapuram, "Multi-task learning for classification with Dirichlet process priors," Journal of Machine Learning Research, vol. 8, pp. 35-63, 2007.
6. S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, 2010. doi: 10.1109/TKDE.2009.191
7. X. T. Yuan, X. Liu, and S. Yan, "Visual classification with multitask joint sparse representation," IEEE Transactions on Image Processing, vol. 21, no. 10, pp. 4349-4360, 2012. doi: 10.1109/TIP.2012.2205006
8. L. Argote and E. Miron-Spektor, "Organizational learning: from experience to knowledge," Organization Science, vol. 22, no. 5, pp. 1123-1137, 2011. doi: 10.1287/orsc.1100.0621
9. C. Vens, J. Struyf, L. Schietgat, S. Dzeroski, and H. Blockeel, "Decision trees for hierarchical multi-label classification," Machine Learning, vol. 73, no. 2, pp. 185-214, 2008. doi: 10.1007/s10994-008-5077-3
10. T. Evgeniou and M. Pontil, "Regularized multi-task learning," in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, 2004, pp. 109-117.
11. S. Rosen, Z. Qian, and Z. M. Mao, "AppProfiler: a flexible method of exposing privacy-related behavior in Android applications to end users," in Proceedings of the 3rd ACM Conference on Data and Application Security and Privacy, San Antonio, TX, 2013, pp. 221-232.
12. J. Wang, M. Kolar, and N. Srebro, "Distributed multi-task learning," in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, 2016, pp. 751-760.
13. R. Tibshirani, "Regression shrinkage and selection via the lasso: a retrospective," Journal of the Royal Statistical Society B, vol. 73, no. 3, pp. 273-282, 2011.
14. R. G. Brereton and G. R. Lloyd, "Support vector machines for classification and regression," Analyst, vol. 135, no. 2, pp. 230-267, 2010.
15. J. Wright, A. Ganesh, S. Rao, and Y. Ma, "Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization," Coordinated Science Laboratory, University of Illinois, Urbana, IL, Report No. UILU-ENG-09-2210 (DC-243), 2009.
16. X. Ding, Y. Chen, Z. Tang, and Y. Huang, "Camera identification based on domain knowledge-driven deep multi-task learning," IEEE Access, vol. 7, pp. 25878-25890, 2019.
17. D. Mateos-Nunez and J. Cortes, "Distributed optimization for multi-task learning via nuclear-norm approximation," IFAC-PapersOnLine, vol. 48, no. 22, pp. 64-69, 2015.
18. M. Zhao, H. Zhang, W. Cheng, and Z. Zhang, "Joint lp- and l2,p-norm minimization for subspace clustering with outlier pursuit," in Proceedings of 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2016, pp. 3658-3665.
19. M. Zhang, Y. Yang, H. Zhang, F. Shen, and D. Zhang, "L2,p-norm and sample constraint based feature selection and classification for AD diagnosis," Neurocomputing, vol. 195, pp. 104-111, 2016.
20. R. Caruana, "Multitask learning," Machine Learning, vol. 28, no. 1, pp. 41-75, 1997. doi: 10.1023/A:1007379606734
21. D. Zhang and D. Shen, "Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer's disease," NeuroImage, vol. 59, no. 2, pp. 895-907, 2012. doi: 10.1016/j.neuroimage.2011.09.069
22. Z. Hu, B. Li, and J. Luo, "Time- and cost-efficient task scheduling across geo-distributed data centers," IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 3, pp. 705-718, 2018.
23. Y. Wang, M. Nikkhah, X. Zhu, W. T. Tan, and R. Liston, "Learning geographically distributed data for multiple tasks using generative adversarial networks," in Proceedings of 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 4589-4593.
24. X. Cai, F. Nie, H. Huang, and C. Ding, "Multi-class l2,1-norm support vector machine," in Proceedings of 2011 IEEE 11th International Conference on Data Mining, Vancouver, Canada, 2011, pp. 91-100.
25. P. Heins, M. Moeller, and M. Burger, "Locally sparse reconstruction using the l1,∞-norm," Inverse Problems and Imaging, vol. 9, pp. 1093-1137, 2015.
26. P. E. Gill, W. Murray, and M. A. Saunders, "SNOPT: an SQP algorithm for large-scale constrained optimization," SIAM Review, vol. 47, no. 1, pp. 99-131, 2005. doi: 10.1137/S1052623499350013
27. N. Tottenham, J. W. Tanaka, A. C. Leon, T. McCarry, M. Nurse, T. A. Hare, et al., "The NimStim set of facial expressions: judgments from untrained research participants," Psychiatry Research, vol. 168, no. 3, pp. 242-249, 2009.
28. K. S. Kim and S. Y. Chung, "Greedy subspace pursuit for joint sparse recovery," Journal of Computational and Applied Mathematics, vol. 352, pp. 308-327, 2019.
29. S. Yi, Y. Liang, Z. He, Y. Li, and Y. M. Cheung, "Dual pursuit for subspace learning," IEEE Transactions on Multimedia, vol. 21, no. 6, pp. 1399-1411, 2019.
30. V. Smith, C. K. Chiang, M. Sanjabi, and A. Talwalkar, "Federated multi-task learning," 2017 (Online). Available: https://arxiv.org/abs/1705.10467