[79] Zintgraf L, Shiarlis K, Kurin V, et al. Fast context adaptation via meta-learning. In: Proc. of the 36th Int’l Conf. on Machine Learning (ICML 2019). 2019. 13262−13276.
[80] Lan L, Li Z, Guan X, Wang P. Meta reinforcement learning with task embedding and shared policy. In: Proc. of the Int’l Joint Conf.
on Artificial Intelligence. 2019. 2794−2800.
[81] Humplik J, Galashov A, Hasenclever L, et al. Meta reinforcement learning as task inference. arXiv:1905.06424, 2019.
[82] Lu JY, Ling XH, Liu Q, et al. Meta-reinforcement learning algorithm based on automating policy entropy. Computer Science, 2021, 48(6): 168−174 (in Chinese with English abstract).
[83] Raileanu R, Goldstein M, Szlam A, et al. Fast adaptation to new environments via policy-dynamics value functions. In: Proc. of the
Int’l Conf. on Machine Learning. 2020. 7920−7931.
[84] Zhang A, Sodhani S, Khetarpal K, et al. Learning robust state abstractions for hidden-parameter block mdps. In: Proc. of the Int’l
Conf. on Learning Representations. 2021.
[85] van den Oord A, Li Y, Vinyals O. Representation learning with contrastive predictive coding. arXiv:1807.03748, 2018.
[86] He K, Fan H, Wu Y, et al. Momentum contrast for unsupervised visual representation learning. In: Proc. of the IEEE Computer
Society Conf. on Computer Vision and Pattern Recognition. 2020. 9726−9735.
[87] Laskin M, Srinivas A, Abbeel P. CURL: Contrastive unsupervised representations for reinforcement learning. In: Proc. of the 37th Int’l Conf. on Machine Learning (ICML 2020). 2020. 5595−5606.
[88] Fu H, Tang H, Hao J, et al. Towards effective context for meta-reinforcement learning: An approach based on contrastive learning.
Proc. of the AAAI Conf. on Artificial Intelligence, 2021, 35(8): 7457−7465.
[89] Wang B, Xu S, Keutzer K, et al. Improving context-based meta-reinforcement learning with self-supervised trajectory contrastive
learning. arXiv:2103.06386, 2021.
[90] Mu Y, Zhuang Y, Ni F, et al. DOMINO: Decomposed mutual information optimization for generalized context in
meta-reinforcement learning. In: Advances in Neural Information Processing Systems, Vol.35. 2022. 27563−27575.
[91] Raghu A, Raghu M, Bengio S, et al. Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. In: Proc. of the Int’l Conf. on Learning Representations. 2020.
[92] Kao CH, Chiu WC, Chen PY. MAML is a noisy contrastive learner. In: Proc. of the Int’l Conf. on Learning Representations. 2022.
[93] Hutsebaut-Buysse M, Mets K, Latré S. Hierarchical reinforcement learning: A survey and open research challenges. Machine
Learning and Knowledge Extraction, 2022, 4(1): 172−221.
[94] Tessler C, Givony S, Zahavy T, et al. A deep hierarchical approach to lifelong learning in Minecraft. In: Proc. of the AAAI Conf. on Artificial Intelligence. 2017. arXiv:1604.07255.
[95] Fu H, Tang H, Hao J, et al. MGHRL: Meta goal-generation for hierarchical reinforcement learning. In: Distributed Artificial Intelligence, Vol.12547. Springer, 2020. 29−39.
[96] Li J, Wang X, Tang S, et al. Unsupervised reinforcement learning of transferable meta-skills for embodied navigation. In: Proc. of
the IEEE Computer Society Conf. on Computer Vision and Pattern Recognition. 2020. 12120−12129.
[97] Lu J, Salvador J, Mottaghi R, et al. ASC me to do anything: Multi-task training for embodied AI. arXiv:2202.06987, 2022.
[98] Nie K, Meng QH. Combat behavior evaluation based on hierarchical episodic meta-deep reinforcement learning. Command Control & Simulation, 2021, 43(2): 65−71 (in Chinese with English abstract).
[99] Sohn S, Woo H, Choi J, et al. Meta reinforcement learning with autonomous inference of subtask dependencies. In: Proc. of the
Int’l Conf. on Learning Representations. 2020.
[100] Peng M, Zhu B, Jiao J. Linear representation meta-reinforcement learning for instant adaptation. arXiv:2101.04750, 2021.
[101] Chua K, Lei Q, Lee J. Provable hierarchy-based meta-reinforcement learning. In: Ruiz F, Dy J, Van de Meent JW, eds. Proc. of the
26th Int’l Conf. on Artificial Intelligence and Statistics, Vol.206. 2023. 10918−10967.
[102] Amin S, Gomrokchi M, Satija H, et al. A survey of exploration methods in reinforcement learning. arXiv:2109.00157, 2021.
[103] Stadie BC, Yang G, Houthooft R, et al. Some considerations on learning to explore via meta-reinforcement learning. In: Advances
in Neural Information Processing Systems. 2018. 9280−9290.
[104] Gurumurthy S, Kumar S, Sycara K. MAME: Model-agnostic meta-exploration. In: Proc. of the Conf. on Robot Learning. 2020.
910−922.
[105] Xu T, Liu Q, Zhao L, et al. Learning to explore via meta-policy gradient. In: Proc. of the Int’l Conf. on Machine Learning. 2018.
5463−5472.
[106] Gupta A, Mendonca R, Liu YX, et al. Meta-reinforcement learning of structured exploration strategies. In: Advances in Neural
Information Processing Systems, Vol.31. 2018. 5302−5311.
[107] Alet F, Schneider MF, Lozano-Perez T, et al. Meta-learning curiosity algorithms. In: Proc. of the Int’l Conf. on Learning
Representations. 2020.
[108] Hu H, Huang G, Li X, et al. Meta-reinforcement learning with dynamic adaptiveness distillation. IEEE Trans. on Neural Networks
and Learning Systems, 2023, 34(3): 1454−1464.