1752 Journal of Software (Ruan Jian Xue Bao), 2025, Vol. 36, No. 4

[43] Mousavian A, Toshev A, Fišer M, Košecká J, Wahid A, Davidson J. Visual representations for semantic target driven navigation. In:
Proc. of the 2019 Int’l Conf. on Robotics and Automation. Montreal: IEEE, 2019. 8846–8852. [doi: 10.1109/ICRA.2019.8793493]
[44] Wu Y, Wu YX, Tamar A, Russell S, Gkioxari G, Tian YD. Bayesian relational memory for semantic visual navigation. In: Proc. of the
2019 IEEE/CVF Int’l Conf. on Computer Vision. Seoul: IEEE, 2019. 2769–2779. [doi: 10.1109/ICCV.2019.00286]
[45] Chaplot DS, Gandhi D, Gupta A, Salakhutdinov R. Object goal navigation using goal-oriented semantic exploration. In: Proc. of the
34th Int’l Conf. on Neural Information Processing Systems. Vancouver: Curran Associates Inc., 2020. 357.
[46] Maksymets O, Cartillier V, Gokaslan A, Wijmans E, Galuba W, Lee S, Batra D. THDA: Treasure hunt data augmentation for semantic
navigation. In: Proc. of the 2021 IEEE/CVF Int’l Conf. on Computer Vision. Montreal: IEEE, 2021. 15354–15363.
[doi: 10.1109/ICCV48922.2021.01509]
[47] Deitke M, Vander Bilt E, Herrasti A, Weihs L, Salvador J, Ehsani K, Han W, Kolve E, Farhadi A, Kembhavi A, Mottaghi R.
ProcTHOR: Large-scale embodied AI using procedural generation. In: Proc. of the 36th Int’l Conf. on Neural Information Processing
Systems. New Orleans: Curran Associates Inc., 2022. 433.
[48] Zhou K, Zhang HY, Li F. TransNav: Spatial sequential Transformer network for visual navigation. Journal of Computational Design and
Engineering, 2022, 9(5): 1866–1878. [doi: 10.1093/jcde/qwac084]
[49] Li F, Guo C, Zhang HY, Luo BH. Context vector-based visual mapless navigation in indoor using hierarchical semantic information and
meta-learning. Complex & Intelligent Systems, 2023, 9(2): 2031–2041. [doi: 10.1007/s40747-022-00902-7]

[50] Yin J, Zhang ZD, Gao YH, Yang ZW, Li L, Xiao M, Sun YQ, Yan CG. Survey on vision-language pre-training.
Ruan Jian Xue Bao/Journal of Software, 2023, 34(5): 2000–2023 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6774.htm
[doi: 10.13328/j.cnki.jos.006774]
[51] Du PF, Li XY, Gao YL. Survey on multimodal visual language representation learning. Ruan Jian Xue Bao/Journal of Software, 2021,
32(2): 327–348 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6125.htm [doi: 10.13328/j.cnki.jos.006125]
[52] Gadre SY, Wortsman M, Ilharco G, Schmidt L, Song SR. CoWs on pasture: Baselines and benchmarks for language-driven zero-shot
object navigation. In: Proc. of the 2023 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023.
23171–23181. [doi: 10.1109/CVPR52729.2023.02219]
[53] Majumdar A, Aggarwal G, Devnani B, Hoffman J, Batra D. ZSON: Zero-shot object-goal navigation using multimodal goal
embeddings. In: Proc. of the 36th Int’l Conf. on Neural Information Processing Systems. New Orleans: Curran Associates Inc., 2022.
2343.
[54] Ramakrishnan SK, Jayaraman D, Grauman K. An exploration of embodied visual exploration. Int’l Journal of Computer Vision, 2021,
129(5): 1616–1649. [doi: 10.1007/s11263-021-01437-z]
[55] Zhang TY, Hu XG, Xiao J, Zhang GF. A survey of visual navigation: From geometry to embodied AI. Engineering Applications of
Artificial Intelligence, 2022, 114: 105036. [doi: 10.1016/j.engappai.2022.105036]
[56] Duan JF, Yu S, Tan HL, Zhu HY, Tan C. A survey of embodied AI: From simulators to research tasks. IEEE Trans. on Emerging Topics
in Computational Intelligence, 2022, 6(2): 230–244. [doi: 10.1109/TETCI.2022.3141105]
[57] Gu J, Stefani E, Wu Q, Thomason J, Wang X. Vision-and-language navigation: A survey of tasks, methods, and future directions. In:
Proc. of the 60th Annual Meeting of the Association for Computational Linguistics (Vol. 1: Long Papers). Dublin: ACL, 2022.
7606–7623. [doi: 10.18653/v1/2022.acl-long.524]
[58] Cao C, Zhu H, Ren Z, Choset H, Zhang J. Representation granularity enables time-efficient autonomous exploration in large, complex
worlds. Science Robotics, 2023, 8(80): eadf0970. [doi: 10.1126/scirobotics.adf0970]
[59] Garaffa LC, Basso M, Konzen AA, de Freitas EP. Reinforcement learning for mobile robotics exploration: A survey. IEEE Trans. on
Neural Networks and Learning Systems, 2023, 34(8): 3796–3810. [doi: 10.1109/TNNLS.2021.3124466]
[60] Wang L, Qi Y, He BB, Zhang YJ, Xu YC. Survey of autonomous exploration algorithms for robots. Journal of Computer Applications,
2023, 43(S1): 314–322 (in Chinese with English abstract). [doi: 10.11772/j.issn.1001-9081.2022111706]
[61] Zhang SY, Zhang XB, Yuan J, Fang YC. A survey on coverage and exploration path planning with multi-rotor micro aerial vehicles.
Control and Decision, 2022, 37(3): 513–529 (in Chinese with English abstract). [doi: 10.13195/j.kzyjc.2021.1751]
[62] Fang K, Toshev A, Fei-Fei L, Savarese S. Scene memory Transformer for embodied agents in long-horizon tasks. In: Proc. of the 2019
IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 538–547. [doi: 10.1109/CVPR.2019.00063]
[63] Fortunato M, Tan M, Faulkner R, Hansen S, Badia AP, Buttimore G, Deck C, Leibo JZ, Blundell C. Generalization of reinforcement
learners with working and episodic memory. In: Proc. of the 33rd Int’l Conf. on Neural Information Processing Systems. Vancouver:
Curran Associates Inc., 2019. 1117.
[64] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In: Proc. of the
31st Int’l Conf. on Neural Information Processing Systems. Long Beach: Curran Associates Inc., 2017. 6000–6010.
