Journal of Software (Ruan Jian Xue Bao), 2025, No. 4
Journal of Software (Ruan Jian Xue Bao), ISSN 1000-9825, CODEN RUXUEW, E-mail: jos@iscas.ac.cn
2025,36(4):1715−1757 [doi: 10.13328/j.cnki.jos.007250] [CSTR: 32375.14.jos.007250] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
Survey on Object Goal Navigation for Embodied AI*
CHEN Bo-Lei, KANG Jia-Xu, ZHONG Ping, CUI Yong-Zheng, LU Si-Yi, YANG Hao-Nan, WANG Jian-Xin
(School of Computer Science and Engineering, Central South University, Changsha 410083, China)
Corresponding author: ZHONG Ping, E-mail: ping.zhong@csu.edu.cn
Abstract: With the continuous development of computer vision and artificial intelligence (AI) in recent years, embodied AI has received widespread attention from academia and industry in China and abroad. Embodied AI emphasizes that an embodied agent should actively obtain real feedback from the physical world through situated interaction with its environment, and become more intelligent by learning from that feedback. As one of the concrete tasks of embodied AI, object goal navigation requires an agent to search for and navigate to a specified object goal (e.g., find a sink) in a previously unseen, complex, and semantically rich scene. Object goal navigation has great application potential in intelligent assistants that support daily human activities, and it is a foundation and prerequisite for other interaction-based embodied AI research. This survey systematically classifies and reviews current work on object goal navigation. First, it introduces background on environment representation and autonomous visual exploration, and classifies and analyzes existing object goal navigation methods from three different perspectives. Second, it introduces two categories of higher-level object rearrangement tasks and describes realistic indoor simulation environment datasets, evaluation metrics, and a general training paradigm for navigation policies. Finally, it compares and analyzes the performance of existing object goal navigation policies on different datasets, summarizes the challenges facing the field, and discusses future directions.
Key words: object goal navigation; embodied AI; autonomous visual exploration; visual object rearrangement
Chinese Library Classification number: TP18
Citation: Chen BL, Kang JX, Zhong P, Cui YZ, Lu SY, Yang HN, Wang JX. Survey on Object Goal Navigation for Embodied AI. Ruan Jian Xue Bao/Journal of Software, 2025, 36(4): 1715–1757 (in Chinese). http://www.jos.org.cn/1000-9825/7250.htm
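* Funding: National Natural Science Foundation of China (62172443); Natural Science Foundation of Hunan Province (2022JJ30760); Natural Science Foundation of Changsha (kq2202107, kq2202108)
Received: 2023-05-29; Revised: 2023-10-08, 2024-05-14; Accepted: 2024-07-15; Published online by JOS: 2024-11-27
CNKI online-first publication: 2024-11-28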