Page 304 - Journal of Software (软件学报), 2025, No. 4

1710    Journal of Software (软件学报), 2025, Vol. 36, No. 4


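(4) Public MVS datasets are limited in number, and their evaluation metrics are inconsistent. Existing real-capture datasets are constrained by acquisition equipment: the amount of data available for training and testing is limited, and so is the range of scenes. As a result, they cannot support the training of general-purpose models with stronger generalization, nor can they provide labels for regions that are hard to reconstruct, such as the sky. Synthetic datasets reduce the cost of sampling, but they cannot faithfully reproduce the lighting effects and noise of natural images. In addition, the variety of dataset types has led to evaluation metrics that differ from one dataset to another. If larger-scale datasets that are more comprehensive, realistic, and diverse can be created for model training and testing, the performance of 3D reconstruction is expected to improve further.

With the rapid development of computer technology and the continued refinement of models, more powerful neural network architectures such as Transformer have emerged in recent years. The advent of NeRF and GS has also brought a leap forward in scene representation, and multi-view 3D reconstruction combined with these techniques has great potential. Creating broader, more general standard datasets and establishing consistent evaluation metrics is another direction for future work. It is foreseeable that multi-view stereo will continue to mature and will be applied in more fields.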
