[23] Cho JH, Hariharan B. On the efficacy of knowledge distillation. In: Proc. of the IEEE Int’l Conf. on Computer Vision. Seoul: IEEE, 2019. 4794−4802.
[24] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. In: Proc. of the NIPS Deep Learning and Representation Learning Workshop. 2015.
[25] Yim J, Joo D, Bae J, et al. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017. 4133−4141.
[26] Heo B, Lee M, Yun S, et al. Knowledge distillation with adversarial samples supporting decision boundary. In: Proc. of the AAAI Conf. on Artificial Intelligence. Honolulu: AAAI, 2019. 33:3771−3778.
[27] Yang C, Xie L, Su C, et al. Snapshot distillation: Teacher-student optimization in one generation. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 2859−2868.
[28] Cha H, Park J, Kim H, et al. Federated reinforcement distillation with proxy experience memory. In: Proc. of the 1st Int’l Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FML 2019). 2019.
[29] Guo H, Tang R, Ye Y, et al. DeepFM: A factorization-machine based neural network for CTR prediction. In: Proc. of the 26th Int’l Joint Conf. on Artificial Intelligence. Melbourne: IJCAI.org, 2017. 1725−1731.
[30] Hu C, Meng XW, Zhang YJ, et al. Enhanced group recommendation method based on preference aggregation. Ruan Jian Xue Bao/Journal of Software, 2018,29(10):3164−3183 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5288.htm [doi: 10.13328/j.cnki.jos.005288]
[31] Xiao J, Ye H, He X, et al. Attentional factorization machines: Learning the weight of feature interactions via attention networks. In: Proc. of the 26th Int’l Joint Conf. on Artificial Intelligence. Melbourne: IJCAI.org, 2017. 3119−3125.
[32] Wang Q, Liu FA, Xing S, et al. A new approach for advertising CTR prediction based on deep neural network via attention mechanism. Computational and Mathematical Methods in Medicine, 2018.
[33] Li T, Sahu AK, Talwalkar A, et al. Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine, 2020,37(3):50−60.
[34] Wang X, Han Y, Wang C, et al. In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning. IEEE Network, 2019,33:156−165.
[35] Wu B, Lou ZZ, Ye YD. Co-regularized matrix factorization recommendation algorithm. Ruan Jian Xue Bao/Journal of Software, 2018,29(9):2681−2696 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5274.htm [doi: 10.13328/j.cnki.jos.005274]
[36] Cui X, Zhang W, Tüske Z, et al. Evolutionary stochastic gradient descent for optimization of deep neural networks. In: Proc. of the Conf. on Neural Information Processing Systems (NeurIPS). Montréal, 2018. 6048−6058.
[37] Cutkosky A, Orabona F. Momentum-based variance reduction in non-convex SGD. In: Proc. of the Conf. on Neural Information Processing Systems (NeurIPS). Vancouver, 2019. 15236−15245.
[38] Wang Y, Zhou P, Zhong W. An optimization strategy based on hybrid algorithm of Adam and SGD. MATEC Web of Conf., 2018, 232:03007.
[39] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks. In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017. 4700−4708.
[40] Li Q, Tai C, Weinan E. Stochastic modified equations and adaptive stochastic gradient algorithms. In: Proc. of the Int’l Conf. on Machine Learning (ICML). 2017. 2101−2110.
[41] Bottou L. Stochastic gradient descent tricks. In: Neural Networks: Tricks of the Trade. Heidelberg: Springer-Verlag, 2012. 421−436.
[42] Harper FM, Konstan JA. The MovieLens datasets: History and context. ACM Trans. on Interactive Intelligent Systems (TIIS), 2015,5(4):1−19.
[43] Yu H, Li JH. Algorithm to solve the cold-start problem in new item recommendations. Ruan Jian Xue Bao/Journal of Software, 2015,26(6):1395−1408 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4872.htm [doi: 10.13328/j.cnki.jos.004872]
[44] Cheng HT, Koc L, Harmsen J, et al. Wide & deep learning for recommender systems. In: Proc. of the 1st Workshop on Deep Learning for Recommender Systems (DLRS). Boston: ACM, 2016. 7−10.
[45] Zhou G, Zhu X, Song C, et al. Deep interest network for click-through rate prediction. In: Proc. of the 24th ACM SIGKDD Int’l Conf. on Knowledge Discovery & Data Mining. London: ACM, 2018. 1059−1068.