
         [14]     Finn C, Levine S. Meta-learning: From few-shot learning to rapid reinforcement learning. In: Proc. of the Int’l Conf. on Machine
             Learning. 2019.
         [15]     Vanschoren J. Meta-learning: A survey. arXiv:1810.03548, 2018.
         [16]     Huisman M, Van Rijn JN, Plaat A. A survey of deep meta-learning. Artificial Intelligence Review, 2021, 54(6): 4483−4541.
         [17]     Li FC, Liu Y, Wu PX, et al. A survey on recent advances in meta-learning. Chinese Journal of Computers, 2021, 44(2): 422−446
             (in Chinese with English abstract).
         [18]     Tan XY, Zhang Z. Review on meta reinforcement learning. Journal of Nanjing University of Aeronautics & Astronautics, 2021,
             53(5): 653−663 (in Chinese with English abstract).
         [19]     Yadav P, Mishra A, Lee J, et al. A survey on deep reinforcement learning-based approaches for adaptation and generalization.
              arXiv:2202.08444, 2022.
         [20]     Levine S. Reinforcement learning and control as probabilistic inference: Tutorial and review. arXiv:1805.00909, 2018.
         [21]     Zhao CY, Lai J. Survey on meta reinforcement learning. Application Research of Computers, 2023, 40(1): 1−10 (in Chinese with
             English abstract).
         [22]     Beck J, Vuorio R, Liu EZ, et al. A survey of meta-reinforcement learning. arXiv:2301.08028, 2023.
         [23]     Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540):
              529−533.
         [24]     Wang Z, Schaul T, Hessel M, et al. Dueling network architectures for deep reinforcement learning. In: Proc. of the Int’l Conf. on
             Machine Learning, Vol.48. 2016. 1995−2003.
         [25]     Hausknecht M, Stone P. Deep recurrent Q-learning for partially observable MDPs. In: Proc. of the AAAI Fall Symp. 2015. 29−37.
         [26]     Williams RJ. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 1992,
              8(3−4): 229−256.
         [27]     Lillicrap TP, Hunt JJ, Pritzel A, et al. Continuous control with deep reinforcement learning. In: Proc. of the Int’l Conf. on Learning
             Representations. 2016.
         [28]     Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms. arXiv:1707.06347, 2017.
         [29]     Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic
             actor. In: Proc. of the Int’l Conf. on Machine Learning, Vol.80. 2018. 1856−1865.
         [30]     Fujimoto S, Van Hoof H, Meger D. Addressing function approximation error in actor-critic methods. In: Proc. of the Int’l Conf. on
             Machine Learning. 2018. 1587−1596.
         [31]     Finn C. Learning to Learn with Gradients [Ph.D. Thesis]. Berkeley: University of California, 2018.
         [32]     Rakelly K, Zhou A, Finn C, et al. Efficient off-policy meta-reinforcement learning via probabilistic context variables. In: Proc. of
             the Int’l Conf. on Machine Learning. 2019. 5331−5340.
         [33]     Fakoor R, Chaudhari P, Soatto S, et al. Meta-Q-learning. In: Proc. of the Int’l Conf. on Learning Representations. 2020.
         [34]     Zintgraf L, Schulze S, Lu C, et al. VariBAD: Variational Bayes-adaptive deep RL via meta-learning. The Journal of Machine
              Learning Research, 2021, 22(1): 13198−13236.
         [35]     Nichol A, Achiam J, Schulman J. On first-order meta-learning algorithms. arXiv:1803.02999, 2018.
         [36]     Gordon J, Bronskill J, Nowozin S, et al. Meta-learning probabilistic inference for prediction. In: Proc. of the Int’l Conf. on
              Learning Representations. 2018.
         [37]     Santoro A, Bartunov S, Botvinick M, et al. Meta-learning with memory-augmented neural networks. In: Proc. of the Int’l Conf. on
             Machine Learning. 2016. 1842−1850.
         [38]     Ramalho T, Garnelo M. Adaptive posterior learning: Few-shot learning with a surprise-based memory module. In: Proc. of the Int’l
              Conf. on Learning Representations. 2019.
         [39]     Qiao S, Liu C, Shen W, et al. Few-shot image recognition by predicting parameters from activations. In: Proc. of the IEEE/CVF
              Conf. on Computer Vision and Pattern Recognition. IEEE, 2018. 7229−7238.
         [40]     Gidaris S, Komodakis N. Dynamic few-shot visual learning without forgetting. In: Proc. of the IEEE Computer Society Conf. on
              Computer Vision and Pattern Recognition. 2018. 4367−4375.
         [41]     Koch G. Siamese Neural Networks for One-Shot Image Recognition [MS. Thesis]. University of Toronto, 2015.
         [42]     Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning. In: Proc. of the Advances in Neural Information
             Processing Systems (NIPS). 2016. 3637−3645.
         [43]     Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning. In: Proc. of the Advances in Neural Information
              Processing Systems, Vol.30. 2017. 4077−4087.
         [44]     Varghese NV, Mahmoud QH. A survey of multi-task deep reinforcement learning. Electronics, 2020, 9(9): 1363.
         [45]     Khetarpal K, Riemer M, Rish I, et al. Towards continual reinforcement learning: A review and perspectives. Journal of Artificial
             Intelligence Research, 2022, 75: 1401−1476.
         [46]     Wang M, Deng W. Deep visual domain adaptation: A survey. Neurocomputing, 2018, 312: 135−153.
         [47]     Zhou K, Liu Z, Qiao Y, et al. Domain generalization in vision: A survey. arXiv:2103.02503, 2021.