
         [79]     Zintgraf L, Shiarlis K, Kurin V, et al. Fast context adaptation via meta-learning. In: Proc. of the 36th Int'l Conf. on Machine Learning (ICML 2019). 2019. 13262−13276.
         [80]     Lan L, Li Z, Guan X, et al. Meta reinforcement learning with task embedding and shared policy. In: Proc. of the Int'l Joint Conf. on Artificial Intelligence. 2019. 2794−2800.
         [81]     Humplik J, Galashov A, Hasenclever L, et al. Meta reinforcement learning as task inference. arXiv:1905.06424, 2019.
         [82]     Lu JY, Ling XH, Liu Q, et al. Meta-reinforcement learning algorithm based on automating policy entropy. Computer Science, 2021, 48(6): 168−174 (in Chinese with English abstract).
         [83]     Raileanu R, Goldstein M, Szlam A, et al. Fast adaptation to new environments via policy-dynamics value functions. In: Proc. of the
             Int’l Conf. on Machine Learning. 2020. 7920−7931.
         [84]     Zhang A, Sodhani S, Khetarpal K, et al. Learning robust state abstractions for hidden-parameter block MDPs. In: Proc. of the Int'l Conf. on Learning Representations. 2021.
         [85]     van den Oord A, Li Y, Vinyals O. Representation learning with contrastive predictive coding. arXiv:1807.03748, 2018.
         [86]     He K, Fan H, Wu Y, et al. Momentum contrast for unsupervised visual representation learning. In: Proc. of the IEEE Computer
             Society Conf. on Computer Vision and Pattern Recognition. 2020. 9726−9735.
         [87]     Laskin M, Srinivas A, Abbeel P. CURL: Contrastive unsupervised representations for reinforcement learning. In: Proc. of the 37th Int'l Conf. on Machine Learning (ICML 2020). 2020. 5595−5606.
         [88]     Fu H, Tang H, Hao J, et al. Towards effective context for meta-reinforcement learning: An approach based on contrastive learning.
             Proc. of the AAAI Conf. on Artificial Intelligence, 2021, 35(8): 7457−7465.
         [89]     Wang B, Xu S, Keutzer K, et al. Improving context-based meta-reinforcement learning with self-supervised trajectory contrastive
             learning. arXiv:2103.06386, 2021.
         [90]     Mu Y, Zhuang Y, Ni F, et al. DOMINO: Decomposed mutual information optimization for generalized context in meta-reinforcement learning. In: Advances in Neural Information Processing Systems, Vol.35. 2022. 27563−27575.
         [91]     Raghu A, Raghu M, Bengio S, et al. Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. In: Proc. of the Int'l Conf. on Learning Representations. 2020.
         [92]     Kao CH, Chiu WC, Chen PY. MAML is a noisy contrastive learner. In: Proc. of the Int’l Conf. on Learning Representations. 2022.
         [93]     Hutsebaut-Buysse M, Mets K, Latré S. Hierarchical reinforcement learning: A survey and open research challenges. Machine Learning and Knowledge Extraction, 2022, 4(1): 172−221.
         [94]     Tessler C, Givony S, Zahavy T, et al. A deep hierarchical approach to lifelong learning in Minecraft. In: Proc. of the AAAI Conf. on Artificial Intelligence. 2017. arXiv:1604.07255.
         [95]     Fu H, Tang H, Hao J, et al. MGHRL: Meta goal-generation for hierarchical reinforcement learning. In: Distributed Artificial Intelligence, Vol.12547. Springer, 2020. 29−39.
         [96]     Li J, Wang X, Tang S, et al. Unsupervised reinforcement learning of transferable meta-skills for embodied navigation. In: Proc. of
             the IEEE Computer Society Conf. on Computer Vision and Pattern Recognition. 2020. 12120−12129.
         [97]     Lu J, Salvador J, Mottaghi R, et al. ASC me to do anything: Multi-task training for embodied AI. arXiv:2202.06987, 2022.
         [98]     Nie K, Meng QH. Combat behavior evaluation based on hierarchical episodic meta-deep reinforcement learning. Command Control & Simulation, 2021, 43(2): 65−71 (in Chinese with English abstract).
         [99]     Sohn S, Woo H, Choi J, et al. Meta reinforcement learning with autonomous inference of subtask dependencies. In: Proc. of the
             Int’l Conf. on Learning Representations. 2020.
        [100]     Peng M, Zhu B, Jiao J. Linear representation meta-reinforcement learning for instant adaptation. arXiv:2101.04750, 2021.
        [101]     Chua K, Lei Q, Lee J. Provable hierarchy-based meta-reinforcement learning. In: Ruiz F, Dy J, Van de Meent JW, eds. Proc. of the
             26th Int’l Conf. on Artificial Intelligence and Statistics, Vol.206. 2023. 10918−10967.
        [102]     Amin S, Gomrokchi M, Satija H, et al. A survey of exploration methods in reinforcement learning. arXiv:2109.00157, 2021.
        [103]     Stadie BC, Yang G, Houthooft R, et al. Some considerations on learning to explore via meta-reinforcement learning. In: Advances
             in Neural Information Processing Systems. 2018. 9280−9290.
        [104]     Gurumurthy S, Kumar S, Sycara K. MAME: Model-agnostic meta-exploration. In: Proc. of the Conf. on Robot Learning. 2020. 910−922.
        [105]     Xu T, Liu Q, Zhao L, et al. Learning to explore via meta-policy gradient. In: Proc. of the Int’l Conf. on Machine Learning. 2018.
             5463−5472.
        [106]     Gupta A, Mendonca R, Liu YX, et al. Meta-reinforcement learning of structured exploration strategies. In: Advances in Neural
             Information Processing Systems, Vol.31. 2018. 5302−5311.
        [107]     Alet F, Schneider MF, Lozano-Perez T, et al. Meta-learning curiosity algorithms. In: Proc. of the Int'l Conf. on Learning Representations. 2020.
        [108]     Hu H, Huang G, Li X, et al. Meta-reinforcement learning with dynamic adaptiveness distillation. IEEE Trans. on Neural Networks
             and Learning Systems, 2023, 34(3): 1454−1464.