Page 72 - Journal of Software (软件学报), 2024, No. 4

1650                                                       Journal of Software, 2024, Vol. 35, No. 4

        [195]     Savva M, Kadian A, Maksymets O, et al. Habitat: A platform for embodied AI research. In: Proc. of the IEEE/CVF Int’l Conf. on
             Computer Vision. IEEE, 2019. 9339−9347.
        [196]     Wortsman M, Ehsani K, Rastegari M, et al. Learning to learn how to learn: Self-adaptive visual navigation using meta-learning. In:
             Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition. 2019. 6750−6759.
        [197]     Yan L, Liu D, Song Y, et al. Multimodal aggregation approach for memory vision-voice indoor navigation with meta-learning. In:
             Proc. of the IEEE/RSJ Int’l Conf. on Intelligent Robots and Systems. 2020. 5847−5854.
        [198]     Luo Q, Sorokin M, Ha S. A few shot adaptation of visual navigation skills to new observations using meta-learning. In: Proc. of the
             IEEE Int’l Conf. on Robotics and Automation. 2021. 13231−13237.
        [199]     Hu Y, Chen M, Saad W, et al. Distributed multi-agent meta learning for trajectory design in wireless drone networks. IEEE Journal
             on Selected Areas in Communications, 2021, 39(10): 3177−3192.
        [200]     Wen S, Wen Z, Zhang D, et al. A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement
             learning based on transfer learning. Applied Soft Computing, 2021, 110: 107605.
        [201]     Yu Q, Luo L, Liu B, et al. Re-planning of quadrotors under disturbance based on meta reinforcement learning. Journal of
             Intelligent & Robotic Systems, 2023, 107(1): Article Number 13.
        [202]     Nagabandi A, Clavera I, Liu S, et al. Learning to adapt in dynamic, real-world environments through meta-reinforcement learning.
             In: Proc. of the Int’l Conf. on Learning Representations. 2019.
        [203]     Song X, Yang Y, Choromanski K, et al. Rapidly adaptable legged robots via evolutionary meta-learning. In: Proc. of the IEEE/RSJ
             Int’l Conf. on Intelligent Robots and Systems. 2020. 3769−3776.
        [204]     Asayesh S, Chen M, Mehrandezh M, et al. Least-restrictive multi-agent collision avoidance via deep meta reinforcement learning
             and optimal control. In: Proc. of the Robot Intelligence Technology and Applications. 2023. 213−225.
        [205]     Sun Y, Zhang Y. Conversational recommender system. In: Proc. of the Int’l ACM SIGIR Conf. on Research and Development in
             Information Retrieval. 2018. 235−244.
        [206]     Lei W, He X, Miao Y, et al. Estimation-action-reflection: Towards deep interaction between conversational and recommender
             systems. In: Proc. of the Int’l Conf. on Web Search and Data Mining. 2020. 304−312.
        [207]     Deng Y, Li Y, Sun F, et al. Unified conversational recommendation policy learning via graph-based reinforcement learning. In:
             Proc. of the Int’l ACM SIGIR Conf. on Research and Development in Information Retrieval. 2021. 1431−1441.
        [208]     Zou L, Xia L, Gu Y, et al. Neural interactive collaborative filtering. In: Proc. of the Int’l ACM SIGIR Conf. on Research and
             Development in Information Retrieval. 2020. 749−758.
        [209]     Chu Z, Wang H, Xiao Y, et al. Meta policy learning for cold-start conversational recommendation. In: Proc. of the ACM Int’l Conf.
             on Web Search and Data Mining. Association for Computing Machinery, 2023. 222−230.
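         Chinese references:
           [17]  Li FZ, Liu Y, Wu PX, et al. A survey of meta-learning research. Chinese Journal of Computers, 2021, 44(2): 422−446 (in Chinese).
           [18]  Tan XY, Zhang Z. A survey of meta-reinforcement learning. Journal of Nanjing University of Aeronautics & Astronautics, 2021, 53(5): 653−663 (in Chinese).
           [21]  Zhao CY, Lai J. A survey of meta-reinforcement learning. Application Research of Computers, 2023, 40(1): 1−10 (in Chinese).
           [82]  Lu JY, Ling XH, Liu Q, et al. Meta-reinforcement learning algorithm based on adaptive adjustment of policy entropy. Computer Science, 2021, 48(6): 168−174 (in Chinese).
           [98]  Nie K, Meng QH. Adversarial behavior evaluation based on hierarchical episodic meta-reinforcement learning. Command Control & Simulation, 2021, 43(2): 65−71 (in Chinese).
          [152]  Wu SB, Fu QM, Chen JP, Wu HJ, Lu Y. Meta-inverse reinforcement learning method based on relative entropy. Computer Science, 2021, 48(9): 257−263 (in Chinese).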



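                       Chen Yiyu (陈奕宇) (1998-), male, Ph.D. candidate, CCF student member. His main research interests are meta-reinforcement learning and robot control.

                       Ding Tianyu (丁天雨) (1992-), male, Ph.D., senior researcher. His main research interests are deep representation learning, optimization, and computer vision.

                       Huo Jing (霍静) (1989-), female, Ph.D., tenure-track associate professor, CCF professional member. Her main research interests are machine learning, computer vision, and embodied intelligence.

                       Gao Yang (高阳) (1972-), male, Ph.D., professor, CCF distinguished member. His main research interests are artificial intelligence, machine learning, and intelligent systems.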