
        [139]     Liu EZ, Raghunathan A, Liang P, et al. Decoupling exploration and exploitation for meta-reinforcement learning without sacrifices.
             In: Proc. of the Int’l Conf. on Machine Learning. 2021. 6925−6935.
        [140]     Lin Z, Thomas G, Yang G, et al. Model-based adversarial meta-reinforcement learning. In: Advances in Neural Information Processing Systems, Vol.33. 2020. 10161−10173.
        [141]     Lee S, Chung SY. Improving generalization in meta-RL with imaginary tasks from latent dynamics mixture. In: Advances in Neural
             Information Processing Systems, Vol.34. 2021. 27222−27235.
        [142]     Xiong Z, Zintgraf L, Beck J, et al. On the practical consistency of meta-reinforcement learning algorithms. arXiv: 2112.00478, 2021.
        [143]     Fu Q, Wang Z, Fang N, et al. MAML²: Meta reinforcement learning via meta-learning for task categories. Frontiers of Computer Science, 2023, 17(4): 174325.
        [144]     Wang M, Bing Z, Yao X, et al. Meta-reinforcement learning based on self-supervised task representation learning. Proc. of the
             AAAI Conf. on Artificial Intelligence, 2023, 37(8): 10157−10165.
        [145]     Packer C, Abbeel P, Gonzalez JE. Hindsight task relabelling: Experience replay for sparse reward meta-RL. In: Advances in Neural
             Information Processing Systems, Vol.34. 2021. 2466−2477.
        [146]     Guo Y, Wu Q, Lee H. Learning action translator for meta reinforcement learning on sparse-reward tasks. Proc. of the AAAI Conf.
             on Artificial Intelligence, 2022, 36(6): 6792−6800.
        [147]     Lee K, Lee K, Shin J, et al. Network randomization: A simple technique for generalization in deep reinforcement learning. In: Proc.
             of the Int’l Conf. on Learning Representations. 2020.
        [148]     Laskin M, Lee K, Stooke A, et al. Reinforcement learning with augmented data. In: Advances in Neural Information Processing
             Systems, Vol.33. 2020. 19884−19895.
        [149]     Hansen N, Wang X. Generalization in reinforcement learning by soft data augmentation. In: Proc. of the IEEE Int’l Conf. on Robotics and Automation. 2021. 13611−13617.
        [150]     Dorfman R, Shenfeld I, Tamar A. Offline meta reinforcement learning—Identifiability challenges and effective data collection strategies. In: Advances in Neural Information Processing Systems, Vol.34. 2021. 4607−4618.
        [151]     Li J, Vuong Q, Liu S, et al. Multi-task batch reinforcement learning with metric learning. In: Advances in Neural Information Processing Systems, Vol.33. 2020. 6197−6210.
        [152]     Wu SB, Fu QM, Chen JP, et al. Meta-inverse reinforcement learning method based on relative entropy. Computer Science, 2021, 48(9): 257−263 (in Chinese with English abstract).
        [153]     Li L, Yang R, Luo D. FOCAL: Efficient fully-offline meta-reinforcement learning via distance metric learning and behavior regularization. In: Proc. of the Int’l Conf. on Learning Representations. 2021.
        [154]     Lin S, Wan J, Xu T, et al. Model-based offline meta-reinforcement learning with regularization. In: Proc. of the Int’l Conf. on
             Learning Representations. 2022.
        [155]     Luo M, Balakrishna A, Thananjeyan B, et al. MESA: Offline meta-RL for safe adaptation and fault tolerance. In: Proc. of the Workshop at the Conf. on Neural Information Processing Systems. 2021.
        [156]     Yuan H, Lu Z. Robust task representations for offline meta-reinforcement learning via contrastive learning. In: Proc. of the Int’l
             Conf. on Machine Learning. 2022. 25747−25759.
        [157]     Nam T, Sun SH, Pertsch K, et al. Skill-based meta-reinforcement learning. In: Proc. of the Int’l Conf. on Learning Representations.
             2022.
        [158]    Mendonca R, Gupta A, Kralev R, et al. Guided meta-policy search. In: Advances in Neural Information Processing Systems, Vol.32.
             2019.
        [159]     Zhou A, Jang E, Kappler D, et al. Watch, try, learn: Meta-learning from demonstrations and reward. In: Proc. of the Int’l Conf. on
             Learning Representations. 2020.
        [160]     Rengarajan D, Chaudhary S, Kim J, et al. Enhanced meta reinforcement learning using demonstrations in sparse reward environments. In: Advances in Neural Information Processing Systems. 2022.
        [161]     Bhutani V, Majumder A, Vankadari M, et al. Attentive one-shot meta-imitation learning from visual demonstration. In: Proc. of the IEEE Int’l Conf. on Robotics and Automation. 2022. 8584−8590.
        [162]     Caccia M, Mueller J, Kim T, et al. Task-agnostic continual reinforcement learning: In praise of a simple baseline. arXiv: 2205.14495, 2022.
        [163]     Kessler S, Miłoś P, Parker-Holder J, et al. The surprising effectiveness of latent world models for continual reinforcement learning. arXiv: 2211.15944, 2022.
        [164]    Iqbal S, Sha F. Actor-attention-critic for multi-agent reinforcement learning. In: Proc. of the Int’l Conf. on Machine Learning. 2019.
             2961−2970.
        [165]     Singh A, Jain T, Sukhbaatar S. Learning when to communicate at scale in multiagent cooperative and competitive tasks. In: Proc.
             of the Int’l Conf. on Learning Representations. 2019.
        [166]     Koul A. ma-gym: Collection of multi-agent environments based on OpenAI Gym. GitHub repository, 2019.