Journal of Software (《软件学报》), 2021, Issue 12
Chen M, et al.: A recommendation method based on attentional federated distillation  3867
[23] Cho JH, Hariharan B. On the efficacy of knowledge distillation. In: Proc. of the IEEE Int’l Conf. on Computer Vision. Seoul: IEEE,
2019. 4794−4802.
[24] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. In: Proc. of the NIPS Deep Learning and
Representation Learning Workshop. 2015.
[25] Yim J, Joo D, Bae J, et al. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In:
Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017. 4133−4141.
[26] Heo B, Lee M, Yun S, et al. Knowledge distillation with adversarial samples supporting decision boundary. In: Proc. of the AAAI
Conf. on Artificial Intelligence. Honolulu: AAAI, 2019. 33:3771−3778.
[27] Yang C, Xie L, Su C, et al. Snapshot distillation: Teacher-student optimization in one generation. In: Proc. of the IEEE Conf. on
Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 2859−2868.
[28] Cha H, Park J, Kim H, et al. Federated reinforcement distillation with proxy experience memory. In: Proc. of the 1st Int’l
Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FML 2019). 2019.
[29] Guo H, Tang R, Ye Y, et al. DeepFM: A factorization-machine based neural network for CTR prediction. In: Proc. of the 26th Int’l
Joint Conf. on Artificial Intelligence. Melbourne: IJCAI.org, 2017. 1725−1731.
[30] Hu C, Meng XW, Zhang YJ, et al. Enhanced group recommendation method based on preference aggregation. Ruan Jian Xue
Bao/Journal of Software, 2018,29(10):3164−3183 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5288.htm
[doi: 10.13328/j.cnki.jos.005288]
[31] Xiao J, Ye H, He X, et al. Attentional factorization machines: Learning the weight of feature interactions via attention networks. In:
Proc. of the 26th Int’l Joint Conf. on Artificial Intelligence. Melbourne: IJCAI.org, 2017. 3119−3125.
[32] Wang Q, Liu FA, Xing S, et al. A new approach for advertising CTR prediction based on deep neural network via attention
mechanism. Computational and Mathematical Methods in Medicine, 2018.
[33] Li T, Sahu AK, Talwalkar A, et al. Federated learning: Challenges, methods, and future directions. IEEE Signal Processing
Magazine, 2020,37(3):50−60.
[34] Wang X, Han Y, Wang C, et al. In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated
learning. IEEE Network, 2019,33:156−165.
[35] Wu B, Lou ZZ, Ye YD. Co-regularized matrix factorization recommendation algorithm. Ruan Jian Xue Bao/Journal of Software,
2018,29(9):2681−2696 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5274.htm [doi: 10.13328/j.cnki.jos.
005274]
[36] Cui X, Zhang W, Tüske Z, et al. Evolutionary stochastic gradient descent for optimization of deep neural networks. In: Proc. of the
Neural Information Processing Systems. Montréal: JMLR, 2018. 6048−6058.
[37] Cutkosky A, Orabona F. Momentum-based variance reduction in non-convex SGD. In: Proc. of the Neural Information Processing
Systems. Vancouver: JMLR, 2019. 15236−15245.
[38] Wang Y, Zhou P, Zhong W. An optimization strategy based on hybrid algorithm of Adam and SGD. In: Proc. of the MATEC Web
of Conf. 2018. 232:03007.
[39] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks. In: Proc. of the IEEE Conf. on Computer Vision
and Pattern Recognition. Honolulu: IEEE, 2017. 4700−4708.
[40] Li Q, Tai C, Weinan E. Stochastic modified equations and adaptive stochastic gradient algorithms. In: Proc. of the Int’l Conf. on
Machine Learning (ICML). 2017. 2101−2110.
[41] Bottou L. Stochastic gradient descent tricks. In: Neural Networks: Tricks of the Trade. Heidelberg: Springer-Verlag,
2012. 421−436.
[42] Harper FM, Konstan JA. The MovieLens datasets: History and context. ACM Trans. on Interactive Intelligent Systems (TIIS), 2015,
5(4):1−19.
[43] Yu H, Li JH. Algorithm to solve the cold-start problem in new item recommendations. Ruan Jian Xue Bao/Journal of Software,
2015,26(6):1395−1408 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4872.htm [doi: 10.13328/j.cnki.jos.
004872]
[44] Cheng HT, Koc L, Harmsen J, et al. Wide & deep learning for recommender systems. In: Proc. of the 1st Workshop on Deep
Learning for Recommender Systems. Boston: DLRS, 2016. 7−10.
[45] Zhou G, Zhu X, Song C, et al. Deep interest network for click-through rate prediction. In: Proc. of the 24th ACM SIGKDD Int’l
Conf. on Knowledge Discovery & Data Mining. London: ACM, 2018. 1059−1068.