4204 Journal of Software, 2025, Vol. 36, No. 9
[3] Ye J, Chen Z, Liu JH, Du B. TextFuseNet: Scene text detection with richer fused features. In: Proc. of the 29th Int’l Joint Conf. on
Artificial Intelligence. Yokohama: IJCAI, 2020. 516–522.
[4] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv:1503.02531, 2015.
[5] Gao H, Tian YL, Xu FY, Zhong S. Survey of deep learning model compression and acceleration. Ruan Jian Xue Bao/Journal of Software,
2021, 32(1): 68–92 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6096.htm [doi: 10.13328/j.cnki.jos.006096]
[6] Li ZH, Xu PF, Chang XJ, Yang LY, Zhang YY, Yao LN, Chen XJ. When object detection meets knowledge distillation: A survey. IEEE
Trans. on Pattern Analysis and Machine Intelligence, 2023, 45(8): 10555–10579. [doi: 10.1109/TPAMI.2023.3257546]
[7] Du YN, Li CX, Guo RY, Cui C, Liu WW, Zhou J, Lu B, Yang YH, Liu QW, Hu XG, Yu DH, Ma YJ. PP-OCRv2: Bag of tricks for ultra
lightweight OCR system. arXiv:2109.03144, 2021.
[8] Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y. FitNets: Hints for thin deep nets. arXiv:1412.6550, 2014.
[9] Krizhevsky A. Learning multiple layers of features from tiny images [MS. Thesis]. Toronto: University of Toronto, 2009.
[10] He KM, Zhang XY, Ren SQ, Sun J. Deep residual learning for image recognition. In: Proc. of the 2016 IEEE Conf. on Computer Vision
and Pattern Recognition. Las Vegas: IEEE, 2016. 770–778. [doi: 10.1109/CVPR.2016.90]
[11] Karatzas D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, Matas J, Neumann L, Chandrasekhar VR, Lu SJ, Shafait
F, Uchida S, Valveny E. ICDAR 2015 competition on robust reading. In: Proc. of the 13th Int’l Conf. on Document Analysis and
Recognition. Tunis: IEEE, 2015. 1156–1160. [doi: 10.1109/ICDAR.2015.7333942]
[12] Liao MH, Wan ZY, Yao C, Chen K, Bai X. Real-time scene text detection with differentiable binarization. In: Proc. of the 34th AAAI
Conf. on Artificial Intelligence. New York: AAAI, 2020. 11474–11481. [doi: 10.1609/aaai.v34i07.6812]
[13] Lyu PY, Liao MH, Yao C, Wu WH, Bai X. Mask TextSpotter: An end-to-end trainable neural network for spotting text with arbitrary
shapes. In: Proc. of the 15th European Conf. on Computer Vision. Munich: Springer, 2018. 71–88. [doi: 10.1007/978-3-030-01264-9_5]
[14] He KM, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proc. of the 2017 IEEE Int’l Conf. on Computer Vision. Venice: IEEE,
2017. 2980–2988. [doi: 10.1109/ICCV.2017.322]
[15] Zhou XY, Yao C, Wen H, Wang YZ, Zhou SC, He WR, Liang JJ. EAST: An efficient and accurate scene text detector. In: Proc. of the
2017 IEEE Conf. on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017. 2642–2651. [doi: 10.1109/CVPR.2017.283]
[16] Tian Z, Huang WL, He T, He P, Qiao Y. Detecting text in natural image with connectionist text proposal network. In: Proc. of the 14th
European Conf. on Computer Vision. Amsterdam: Springer, 2016. 56–72. [doi: 10.1007/978-3-319-46484-8_4]
[17] Ren SQ, He KM, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. In: Proc. of the
28th Int’l Conf. on Neural Information Processing Systems. Montreal: MIT Press, 2015. 91–99.
[18] Liao MH, Shi BG, Bai X, Wang XG, Liu WY. TextBoxes: A fast text detector with a single deep neural network. In: Proc. of the 31st
AAAI Conf. on Artificial Intelligence. San Francisco: AAAI, 2017. 4161–4167. [doi: 10.1609/aaai.v31i1.11196]
[19] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC. SSD: Single shot multibox detector. In: Proc. of the 14th European
Conf. on Computer Vision. Amsterdam: Springer, 2016. 21–37. [doi: 10.1007/978-3-319-46448-0_2]
[20] Qin XG, Zhou Y, Guo YH, Wu DY, Tian ZH, Jiang N, Wang HB, Wang WP. Mask is all you need: Rethinking Mask R-CNN for dense
and arbitrary-shaped scene text detection. In: Proc. of the 29th ACM Int’l Conf. on Multimedia. Association for Computing Machinery,
2021. 414–423. [doi: 10.1145/3474085.3475178]
[21] Dai P, Zhang S, Zhang H, et al. Progressive contour regression for arbitrary-shape scene text detection. In: Proc. of the 2021 IEEE/CVF
Conf. on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021. 7389–7398. [doi: 10.1109/CVPR46437.2021.00731]
[22] Peng SD, Jiang W, Pi HJ, Li XL, Bao HJ, Zhou XW. Deep snake for real-time instance segmentation. In: Proc. of the 2020 IEEE/CVF
Conf. on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020. 8530–8539. [doi: 10.1109/CVPR42600.2020.00856]
[23] Liao MH, Zou ZS, Wan ZY, Yao C, Bai X. Real-time scene text detection with differentiable binarization and adaptive scale fusion. IEEE
Trans. on Pattern Analysis and Machine Intelligence, 2023, 45(1): 919–931. [doi: 10.1109/TPAMI.2022.3155612]
[24] He T, Shen CH, Tian Z, Gong D, Sun CM, Yan YL. Knowledge adaptation for efficient semantic segmentation. In: Proc. of the 2019
IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 578–587. [doi: 10.1109/CVPR.2019.00067]
[25] Liu YF, Chen K, Liu C, Qin ZC, Luo ZB, Wang JD. Structured knowledge distillation for semantic segmentation. In: Proc. of the 2019
IEEE/CVF Conf. on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019. 2599–2608. [doi: 10.1109/CVPR.2019.00271]
[26] Zhang LF, Song JB, Gao AN, Chen JW, Bao CL, Ma KS. Be your own teacher: Improve the performance of convolutional neural
networks via self distillation. In: Proc. of the 2019 IEEE/CVF Int’l Conf. on Computer Vision. Seoul: IEEE, 2019. 3712–3721.
[doi: 10.1109/ICCV.2019.00381]
[27] Hou YN, Ma Z, Liu CX, Loy CC. Learning lightweight lane detection CNNs by self attention distillation. In: Proc. of the 2019
IEEE/CVF Int’l Conf. on Computer Vision. Seoul: IEEE, 2019. 1013–1021. [doi: 10.1109/ICCV.2019.00110]

