Page 188 - Journal of Software (软件学报), 2021, Issue 12
Journal of Software (软件学报) ISSN 1000-9825, CODEN RUXUEW E-mail:
Journal of Software,2021,32(12):3852−3868 [doi: 10.13328/j.cnki.jos.006128]
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
CHEN Ming, ZHANG Lei, MA Tian-Yi
(Zhejiang HiThink RoyalFlush AI Research Institute, Hangzhou 310012, Zhejiang, China)
Corresponding author: CHEN Ming, E-mail:
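Abstract: Data privacy protection has become one of the major challenges facing recommendation systems. With the promulgation of the Cybersecurity Law of the People's Republic of China and […]
[…] reaching high prediction accuracy, among other problems. Meanwhile, with the arrival of the 5G (the 5th generation mobile communication technology) era, the data volume and transmission rate of personal devices are expected to be 10 to 100 times higher than at present, which requires models to execute more efficiently. To address this, knowledge distillation can transfer the teacher […]
[…] method. The method first adds Kullback-Leibler divergence and a regularization term to the objective function of federated distillation, reducing the discrepancy between the teacher network and the student network […]
[…] 52%, model accuracy improved by 13%, mean error decreased by 17%, and NDCG increased by 10%.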
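The abstract describes adding a Kullback-Leibler divergence term and a regularization term to the federated distillation objective. The following is a minimal, generic sketch of a loss with that shape, not the authors' exact formulation. It assumes a softmax classifier, a temperature-softened teacher distribution, and an L2 penalty as the regularizer; all function names and the hyperparameters `alpha`, `temperature`, and `l2_coeff` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(student_logits, teacher_logits, label, student_weights,
                      alpha=0.5, temperature=2.0, l2_coeff=1e-4):
    """Hard-label cross-entropy + KL to the teacher + L2 regularization.

    Assumed form (illustrative only):
        (1 - alpha) * CE(student, label)
        + alpha * KL(teacher_T || student_T)
        + l2_coeff * ||w||^2
    """
    # Cross-entropy of the student against the ground-truth label.
    student_probs = softmax(student_logits)
    hard_loss = -math.log(student_probs[label] + 1e-12)

    # KL divergence between temperature-softened teacher and student outputs.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_loss = kl_divergence(soft_teacher, soft_student)

    # L2 penalty on the student parameters (the assumed regularization term).
    l2_penalty = l2_coeff * sum(w * w for w in student_weights)

    return (1 - alpha) * hard_loss + alpha * soft_loss + l2_penalty
```

In this sketch the KL term vanishes when the student reproduces the teacher's logits, so only the cross-entropy and the L2 penalty remain; `alpha` controls how strongly the student is pulled toward the teacher.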
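Key words: federated learning; distributed learning; federated distillation; recommendation system; attention mechanism
Chinese Library Classification: TP18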
Citation format (Chinese): 谌明,张蕾,马天翼.一种基于注意力联邦蒸馏的推荐方法.软件学报,2021,32(12):3852−3868. http://www.jos.
Citation format (English): Chen M, Zhang L, Ma TY. Recommendation approach based on attentive federated distillation. Ruan Jian Xue
Bao/Journal of Software, 2021,32(12):3852−3868 (in Chinese).
Recommendation Approach Based on Attentive Federated Distillation
CHEN Ming, ZHANG Lei, MA Tian-Yi
(Zhejiang HiThink RoyalFlush AI Research Institute, Hangzhou 310012, China)
Abstract: Data privacy protection has become one of the major challenges for recommendation systems. With the enactment of the
Cybersecurity Law of the People's Republic of China and the General Data Protection Regulation (GDPR) in the European Union, data privacy and
security have become a worldwide concern. Federated learning can train a global model without exchanging user data, thus protecting
users' privacy. Nevertheless, federated learning still faces many issues, such as the small amount of local data on each device, over-fitting
of local models, and data sparsity, which make it difficult to reach high accuracy. Meanwhile, with the advent of the 5G (the 5th
generation mobile communication technology) era, the data volume and transmission rate of personal devices are expected to be 10 to 100
times higher than at present, which requires more efficient model execution. Knowledge distillation can transfer knowledge from the
teacher model to a more compact student model so that the student model can approach or surpass the performance of the teacher model, thus
effectively addressing the problems of large numbers of model parameters and high communication cost. However, the accuracy of the student model is lower
than that of the teacher model after knowledge distillation. Therefore, a federated distillation approach with attention mechanisms is proposed for
recommendation systems. First, the method introduces Kullback-Leibler divergence and a regularization term into the objective function of
federated distillation to reduce the impact of heterogeneity between the teacher network and the student network; then it introduces multi-head
∗ Received 2020-01-18; Revised 2020-04-18; Accepted 2020-08-07