
2112                                                       软件学报 (Journal of Software), 2025, Vol. 36, No. 5


                     the 17th European Conf. on Computer Vision. Tel Aviv: Springer, 2022. 543–560. [doi: 10.1007/978-3-031-19824-3_32]
[27] Jing LL, Zhang L, Tian YL. Self-supervised feature learning by cross-modality and cross-view correspondences. In: Proc. of the 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops. Nashville: IEEE, 2021. 1581–1591. [doi: 10.1109/CVPRW53098.2021.00174]
[28] Zhang RR, Wang LH, Qiao Y, Gao P, Li HS. Learning 3D representations from 2D pre-trained models via image-to-point masked autoencoders. In: Proc. of the 2023 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023. 21769–21780. [doi: 10.1109/CVPR52729.2023.02085]
[29] Wang ZY, Yu XM, Rao YM, Zhou J, Lu JW. P2P: Tuning pre-trained image models for point cloud analysis with point-to-pixel prompting. In: Proc. of the 36th Int’l Conf. on Neural Information Processing Systems. New Orleans, 2022. 14388–14402.
[30] Dong RP, Qi ZK, Zhang LF, Zhang JB, Sun JJ, Ge Z, Yi L, Ma KS. Autoencoders as cross-modal teachers: Can pretrained 2D image Transformers help 3D representation learning? In: Proc. of the 11th Int’l Conf. on Learning Representations. Kigali: OpenReview.net, 2023.
[31] Zhang RR, Guo ZY, Zhang W, Li KC, Miao XP, Cui B, Qiao Y, Gao P, Li HS. PointCLIP: Point cloud understanding by CLIP. In: Proc. of the 2022 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022. 8552–8562. [doi: 10.1109/CVPR52688.2022.00836]
[32] Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G, Sutskever I. Learning transferable visual models from natural language supervision. In: Proc. of the 38th Int’l Conf. on Machine Learning. PMLR, 2021. 8748–8763.
[33] Zhu XY, Zhang RR, He BW, Guo ZY, Zeng ZY, Qin ZP, Zhang SH, Gao P. PointCLIP V2: Prompting CLIP and GPT for powerful 3D open-world learning. In: Proc. of the 2023 IEEE/CVF Int’l Conf. on Computer Vision. Paris: IEEE, 2023. 2639–2650. [doi: 10.1109/ICCV51070.2023.00249]
[34] Brown TB, Mann B, Ryder N, et al. Language models are few-shot learners. In: Proc. of the 34th Int’l Conf. on Neural Information Processing Systems. Vancouver: Curran Associates Inc., 2020. 159.
[35] Qi ZK, Dong RP, Fan GF, Ge Z, Zhang XY, Ma KS, Yi L. Contrast with reconstruct: Contrastive 3D representation learning guided by generative pretraining. In: Proc. of the 40th Int’l Conf. on Machine Learning. Honolulu: JMLR.org, 2023. 1171.
[36] Chen HN, Zhu YY, Zhao JQ, Tian Q. 3D shape recognition based on multimodal relation modeling. Ruan Jian Xue Bao/Journal of Software, 2024, 35(5): 2208–2219 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/7026.htm [doi: 10.13328/j.cnki.jos.007026]
[37] Xie CL, Wang CX, Zhang B, Yang H, Chen D, Wen F. Style-based point generator with adversarial rendering for point cloud completion. In: Proc. of the 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021. 4619–4628. [doi: 10.1109/CVPR46437.2021.00459]
[38] Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: Proc. of the 2019 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 4401–4410. [doi: 10.1109/CVPR.2019.00453]
[39] Zhou LQ, Du YL, Wu JJ. 3D shape generation and completion through point-voxel diffusion. In: Proc. of the 2021 IEEE/CVF Int’l Conf. on Computer Vision. Montreal: IEEE, 2021. 5826–5835. [doi: 10.1109/ICCV48922.2021.00577]
[40] Pan L, Chen XY, Cai ZG, Zhang JZ, Zhao HY, Yi S, Liu ZW. Variational relational point completion network. In: Proc. of the 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021. 8524–8533. [doi: 10.1109/CVPR46437.2021.00842]
[41] Bardes A, Ponce J, LeCun Y. VICReg: Variance-invariance-covariance regularization for self-supervised learning. In: Proc. of the 10th Int’l Conf. on Learning Representations. OpenReview.net, 2022.
[42] Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai XH, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N. An image is worth 16x16 words: Transformers for image recognition at scale. In: Proc. of the 9th Int’l Conf. on Learning Representations. OpenReview.net, 2021.
[43] Ma X, Qin C, You HX, Ran HX, Fu Y. Rethinking network design and local geometry in point cloud: A simple residual MLP framework. In: Proc. of the 10th Int’l Conf. on Learning Representations. OpenReview.net, 2022.
[44] Qian GC, Li YC, Peng HW, Mai JJ, Hammoud H, Elhoseiny M, Ghanem B. PointNeXt: Revisiting PointNet++ with improved training and scaling strategies. In: Proc. of the 36th Int’l Conf. on Neural Information Processing Systems. New Orleans, 2022. 23192–23204.
[45] Sanghi A. Info3D: Representation learning on 3D objects using mutual information maximization and contrastive learning. In: Proc. of the 16th European Conf. on Computer Vision. Glasgow: Springer, 2020. 626–642. [doi: 10.1007/978-3-030-58526-6_37]
[46] Gadelha M, RoyChowdhury A, Sharma G, Kalogerakis E, Cao LL, Learned-Miller E, Wang R, Maji S. Label-efficient learning on point clouds using approximate convex decompositions. In: Proc. of the 16th European Conf. on Computer Vision. Glasgow: Springer, 2020. 473–491. [doi: 10.1007/978-3-030-58607-2_28]