Page 344 - 《软件学报》2025年第4期

P. 344

1750 软件学报 2025 年第 36 卷第 4 期

References:
[1] Deitke M, Batra D, Bisk Y, et al. Retrospectives on the embodied AI workshop. arXiv:2210.06849, 2022.
[2] Liu HP, Guo D, Sun FC, Zhang XY. Morphology-based embodied intelligence: Historical retrospect and research progress. Acta
Automatica Sinica, 2023, 49(6): 1131–1154 (in Chinese with English abstract). [doi: 10.16383/j.aas.c220564]
[3] Sima SL, Huang Y, He KJ, An D, Yuan H, Wang L. Recent advances in vision-and-language navigation. Acta Automatica Sinica, 2023,
49(1): 1–14 (in Chinese with English abstract). [doi: 10.16383/j.aas.c210352]
[4] Chang A, Dai A, Funkhouser T, Halber M, Niebner M, Savva M, Song SR, Zeng A, Zhang YD. Matterport3D: Learning from RGB-D
data in indoor environments. In: Proc. of the 2017 Int’l Conf. on 3D Vision (3DV). Qingdao: IEEE, 2017. 667−676. [doi: 10.1109/
3DV.2017.00081]
[5] Xia F, Zamir AR, He ZY, Sax A, Malik J, Savarese S. Gibson env: Real-world perception for embodied agents. In: Proc. of the 2018
IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018. 9068–9079. [doi: 10.1109/CVPR.2018.
00945]
[6] Ramakrishnan SK, Gokaslan A, Wijmans E, Maksymets O, Clegg A, Turner J, Undersander E, Galuba W, Westbury A, Chang A, Savva
M, Zhao YL, Batra D. Habitat-Matterport 3D dataset (HM3D): 1 000 large-scale 3D environments for embodied AI. In: Proc. of the
37th Int’l Conf. on Neural Information Processing Systems. NIPS, 2021.

[7] Kolve E, Mottaghi R, Han W, VanderBilt E, Weihs L, Herrasti A, Deitke M, Ehsani K, Gordon D, Zhu YK, Kembhavi A, Gupta A,
Farhadi A. AI2-THOR: An interactive 3D environment for visual AI. arXiv:1712.05474, 2017.
[8] Shen BK, Xia F, Li CS, Martín-Martín R, Fan LX, Wang GZ, Pérez-D’Arpino C, Buch S, Srivastava S, Tchapmi L, Tchapmi M, Vainio
K, Wong J, Fei-Fei L, Savarese S. iGibson 1.0: A simulation environment for interactive tasks in large realistic scenes. In: Proc. of the
2021 IEEE/RSJ Int’l Conf. on Intelligent Robots and Systems. Prague: IEEE, 2021. 7520–7527. [doi: 10.1109/IROS51168.2021.
9636667]
[9] Li CS, Xia F, Martín-Martín R, Lingelbach M, Srivastava S, Shen BK, Vainio KE, Gokmen C, Dharan G, Jain T, Kurenkov A, Liu KR,
Gweon H, Wu JJ, Fei-Fei L, Savares S. iGibson 2.0: Object-centric simulation for robot learning of everyday household tasks. In: Proc.
of the 5th Conf. on Robot Learning. London: PMLR, 2022. 455–465.
[10] Savva M, Kadian A, Maksymets O, Savva M, Kadian A, Maksymets O, Zhao YL, Wijmans E, Jain B, Straub J, Liu J, Koltun V, Malik
J, Parikh D, Batra Dhruv. Habitat: A platform for embodied AI research. In: Proc. of the 2019 IEEE/CVF Int’l Conf. on Computer
Vision. Seoul: IEEE, 2019. 9338–9346. [doi: 10.1109/ICCV.2019.00943]
[11] Szot A, Clegg A, Undersander E, Wijmans E, Zhao YL, Turner J, Maestre N, Mukadam M, Chaplot D, Maksymets O, Gokaslan A,
Vondrus V, Dharur S, Meier F, Galuba W, Chang A, Kira Z, Koltun V, Malik J, Savva M, Batra D. Habitat 2.0: Training home
assistants to rearrange their habitat. In: Proc. of the 35th Int’l Conf. on Neural Information Processing Systems. Curran Associates Inc.,
2021. 20.
[12] Batra D, Gokaslan A, Kembhavi A, Maksymets O, Mottaghi R, Savva M, Toshev A, Wijmans E. ObjectNav revisited: On evaluation of
embodied agents navigating to objects. arXiv:2006.13171, 2020.
[13] Weihs L, Deitke M, Kembhavi A, Mottaghi R. Visual room rearrangement. In: Proc. of the 2021 IEEE/CVF Conf. on Computer Vision
and Pattern Recognition. Nashville: IEEE, 2021. 5918–5927. [doi: 10.1109/CVPR46437.2021.00586]
[14] Ramrakhya R, Undersander E, Batra D, Das A. Habitat-Web: Learning embodied object-search strategies from human demonstrations at
scale. In: Proc. of the 2022 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022. 5163–5173. [doi:
10.1109/CVPR52688.2022.00511]
[15] Ramrakhya R, Batra D, Wijmans E, Das A. PIRLNav: Pretraining with imitation and RL finetuning for ObjectNav. In: Proc. of the 2023
IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023. 17896–17906. [doi: 10.1109/CVPR52729.2023.
01716]
[16] Dang RH, Chen L, Wang LY, He ZT, Liu CJ, Chen QJ. Multiple thinking achieving meta-ability decoupling for object navigation. In:
Proc. of the 40th Int’l Conf. on Machine Learning. Honolulu: PMLR, 2023. 6855–6872.
[17] Gervet T, Chintala S, Batra D, Malik J, Chaplot DS. Navigating to objects in the real world. Science Robotics, 2023, 8(79): eadf6991.
[doi: 10.1126/scirobotics.adf6991]
[18] Liang YQ, Chen BY, Song SR. SSCNav: Confidence-aware semantic scene completion for visual semantic navigation. In: Proc. of the
2021 IEEE Int’l Conf. on Robotics and Automation. Xi’an: IEEE, 2021. 13194–13200. [doi: 10.1109/ICRA48506.2021.9560925]
[19] Georgakis G, Bucher B, Schmeckpeper K, Singh S, Daniilidis K. Learning to map for active semantic goal navigation. In: Proc. of the
10th Int’l Conf. on Learning Representations. ICLR, 2022.
[20] Ramakrishnan SK, Chaplot DS, Al-Halah Z, Malik J, Grauman K. PONI: Potential functions for objectgoal navigation with interaction-

339 340 341 342 343 344 345 346 347 348 349