Page 169 - 《软件学报》2024年第4期
P. 169

钱鸿  等:  基于动态批量评估的绿色无梯度优化方法                                                       1747











                 (a) Housing                   (b) Housing (分析)                 (c) SST-2 低维                (d) SST-2 低维(泛化)









              (e) SST-2 低维(分析)                 (f) SST-2 高维                  (g) SST-2 高维(泛化)            (h) SST-2 高维(分析)
                                   图 9   不同批量大小在各任务上的实验结果图
         6    总   结

             本文提出了基于动态批量评估的绿色无梯度优化方法框架 GRACE,  基于训练子集的相似性,  通过自适应
         动态调整解的评估使用的训练数据子集大小,  在保证评估准确性与稳定性的同时,  有效地降低了评估代价,
         实现了绿色低碳且高效准确的优化.  在实际任务中,  如语言模型即服务提示词黑盒微调和模型超参数优化,
         通过实施对比实验、消融实验等,  展示了 GRACE 算法在时间效率、优化性能、资源效益等方面的优越性.  通
         过在超参数的不同取值下实验,  并分析算法运行过程中批量数的动态变化情况,  说明了算法的稳健性和可靠
         性.  未来,  我们将进一步设计超参数γ的自适应调节解决方案,  以实现更佳的性能.  此外,  我们将探索 GRACE
         在体量更大的数据集以及模型上的应用,  并提出针对无梯度优化任务的绿色评价新指标.

         References:
          [1]    Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electronic Markets, 2021, 31(3): 685−695.
          [2]    Sarker IH. Machine learning: Algorithms, real-world applications and research directions. SN Computer Science, 2021, 2(3): 160.
          [3]    Shih FY. Image Processing and Pattern Recognition: Fundamentals and Techniques. John Wiley & Sons, 2010.
          [4]    Lu D, Weng Q. A survey of image classification methods and techniques for improving classification performance. Int’l Journal of
              Remote Sensing, 2007, 28(5): 823−870.
          [5]    Shen Z, Cui CR, Dong GX, et al. Unified image aesthetic and emotional prediction based on deep multi-task learning. Ruan Jian
              Xue Bao/Journal of Software, 2023, 34(5): 2494−2506 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6487.
              htm [doi: 10.13328/j.cnki.jos.006487]
          [6]    Yu D, Deng L. Automatic Speech Recognition. Berlin: Springer, 2016.
          [7]    Gaikwad SK, Gawali BW, Yannawar P. A review on speech recognition technique. Int’l Journal of Computer Applications, 2010,
              10(3): 16−24.
          [8]    Wolf T, Debut L,  Sanh V,  et al. Transformers: State-of-the-art natural language processing. In: Proc.  of the 2020  Conf.  on
              Empirical Methods in Natural Language Processing. 2020. 38−45.
          [9]    Chowdhary K. Natural Language Processing. Springer, 2020.
         [10]    Galassi A, Lippi M, Torroni P. Attention in natural language processing. IEEE Trans. on Neural Networks and Learning Systems,
              2020, 32(10): 4291−4308.
         [11]    Wang NY, Ye YX, Liu L, et al. Language models based on deep learning: a review. Ruan Jian Xue Bao/Journal of Software, 2021,
              32(4): 1082−1115 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6169.htm [doi: 10.13328/j.cnki.jos.006169]
   164   165   166   167   168   169   170   171   172   173   174