[139] Liu EZ, Raghunathan A, Liang P, et al. Decoupling exploration and exploitation for meta-reinforcement learning without sacrifices.
In: Proc. of the Int’l Conf. on Machine Learning. 2021. 6925−6935.
[140] Lin Z, Thomas G, Yang G, et al. Model-based adversarial meta-reinforcement learning. In: Advances in Neural Information
Processing Systems, Vol.33. 2020. 10161−10173.
[141] Lee S, Chung SY. Improving generalization in meta-RL with imaginary tasks from latent dynamics mixture. In: Advances in Neural
Information Processing Systems, Vol.34. 2021. 27222−27235.
[142] Xiong Z, Zintgraf L, Beck J, et al. On the practical consistency of meta-reinforcement learning algorithms. arXiv: 2112.00478,
2021.
[143] Fu Q, Wang Z, Fang N, et al. MAML²: Meta reinforcement learning via meta-learning for task categories. Frontiers of Computer Science, 2023, 17(4): 174325.
[144] Wang M, Bing Z, Yao X, et al. Meta-reinforcement learning based on self-supervised task representation learning. In: Proc. of the AAAI Conf. on Artificial Intelligence, 2023, 37(8): 10157−10165.
[145] Packer C, Abbeel P, Gonzalez JE. Hindsight task relabelling: Experience replay for sparse reward meta-RL. In: Advances in Neural
Information Processing Systems, Vol.34. 2021. 2466−2477.
[146] Guo Y, Wu Q, Lee H. Learning action translator for meta reinforcement learning on sparse-reward tasks. In: Proc. of the AAAI Conf. on Artificial Intelligence, 2022, 36(6): 6792−6800.
[147] Lee K, Lee K, Shin J, et al. Network randomization: A simple technique for generalization in deep reinforcement learning. In: Proc.
of the Int’l Conf. on Learning Representations. 2020.
[148] Laskin M, Lee K, Stooke A, et al. Reinforcement learning with augmented data. In: Advances in Neural Information Processing
Systems, Vol.33. 2020. 19884−19895.
[149] Hansen N, Wang X. Generalization in reinforcement learning by soft data augmentation. In: Proc. of the IEEE Int’l Conf. on
Robotics and Automation. 2021. 13611−13617.
[150] Dorfman R, Shenfeld I, Tamar A. Offline meta reinforcement learning—Identifiability challenges and effective data collection
strategies. In: Advances in Neural Information Processing Systems, Vol.34. 2021. 4607−4618.
[151] Li J, Vuong Q, Liu S, et al. Multi-task batch reinforcement learning with metric learning. In: Advances in Neural Information
Processing Systems, Vol.33. 2020. 6197−6210.
[152] Wu SB, Fu QM, Chen JP, et al. Meta-inverse reinforcement learning method based on relative entropy. Computer Science, 2021, 48(9): 257−263 (in Chinese with English abstract).
[153] Li L, Yang R, Luo D. FOCAL: Efficient fully-offline meta-reinforcement learning via distance metric learning and behavior
regularization. In: Proc. of the Int’l Conf. on Learning Representations. 2021.
[154] Lin S, Wan J, Xu T, et al. Model-based offline meta-reinforcement learning with regularization. In: Proc. of the Int’l Conf. on
Learning Representations. 2022.
[155] Luo M, Balakrishna A, Thananjeyan B, et al. MESA: Offline meta-RL for safe adaptation and fault tolerance. In: Proc. of the
Workshop at the Conf. on Neural Information Processing Systems. 2021.
[156] Yuan H, Lu Z. Robust task representations for offline meta-reinforcement learning via contrastive learning. In: Proc. of the Int’l
Conf. on Machine Learning. 2022. 25747−25759.
[157] Nam T, Sun SH, Pertsch K, et al. Skill-based meta-reinforcement learning. In: Proc. of the Int’l Conf. on Learning Representations.
2022.
[158] Mendonca R, Gupta A, Kralev R, et al. Guided meta-policy search. In: Advances in Neural Information Processing Systems, Vol.32.
2019.
[159] Zhou A, Jang E, Kappler D, et al. Watch, try, learn: Meta-learning from demonstrations and reward. In: Proc. of the Int’l Conf. on
Learning Representations. 2020.
[160] Rengarajan D, Chaudhary S, Kim J, et al. Enhanced meta reinforcement learning using demonstrations in sparse reward
environments. In: Advances in Neural Information Processing Systems. 2022.
[161] Bhutani V, Majumder A, Vankadari M, et al. Attentive one-shot meta-imitation learning from visual demonstration. In: Proc. of the IEEE Int’l Conf. on Robotics and Automation. 2022. 8584−8590.
[162] Caccia M, Mueller J, Kim T, et al. Task-agnostic continual reinforcement learning: In praise of a simple baseline. arXiv: 2205.14495, 2022.
[163] Kessler S, Miłoś P, Parker-Holder J, et al. The surprising effectiveness of latent world models for continual reinforcement learning. arXiv: 2211.15944, 2022.
[164] Iqbal S, Sha F. Actor-attention-critic for multi-agent reinforcement learning. In: Proc. of the Int’l Conf. on Machine Learning. 2019.
2961−2970.
[165] Singh A, Jain T, Sukhbaatar S. Learning when to communicate at scale in multiagent cooperative and competitive tasks. In: Proc.
of the Int’l Conf. on Learning Representations. 2019.
[166] Koul A. ma-gym: Collection of multi-agent environments based on OpenAI Gym. GitHub repository, 2019.