…method. The method first performs syntactic parsing on Chinese and Vietnamese to obtain syntactic parse trees; the parse-tree information is then converted into vector representations; finally, the resulting syntactic vectors are fused into the encoder input of the neural machine translation model. Experiments were carried out on Chinese-Vietnamese and English-Vietnamese data, and the effects of convolutional neural networks of different depths and of different convolution kernel sizes were also compared. The results show that incorporating source-language syntactic parse-tree information effectively improves the performance of the Chinese-Vietnamese neural machine translation model. The method still has shortcomings: because Vietnamese word segmentation, part-of-speech tagging, and syntactic parsing are not sufficiently accurate, errors arise in feature extraction during training, which degrades the performance of the final neural machine translation model. In future work, we will explore improvements on the neural machine translation model itself to further raise the performance of Chinese-Vietnamese neural machine translation.
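To make the fusion step concrete, the sketch below shows one plausible way to inject parse-tree information into a convolutional encoder's input: each source token's word embedding is concatenated with an embedding of its syntactic label taken from the parse tree, and the combined representation is passed through a stack of 1-D convolutions. This is a minimal illustration under assumed names and dimensions (SyntaxAwareEncoder, syntax_ids, the layer count and kernel size), not the authors' implementation.

```python
# Minimal sketch (PyTorch) of fusing parse-tree labels into a CNN encoder input.
# All names, dimensions, and the kernel size are illustrative assumptions,
# not the configuration reported in the paper.
import torch
import torch.nn as nn

class SyntaxAwareEncoder(nn.Module):
    def __init__(self, vocab_size, syntax_vocab_size,
                 word_dim=256, syntax_dim=64, hidden_dim=256,
                 num_layers=3, kernel_size=3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Embedding of the syntactic label (e.g. constituent or POS tag)
        # attached to each token from the source-language parse tree.
        self.syntax_emb = nn.Embedding(syntax_vocab_size, syntax_dim)
        layers = []
        in_dim = word_dim + syntax_dim
        for _ in range(num_layers):
            layers.append(nn.Conv1d(in_dim, hidden_dim, kernel_size,
                                    padding=kernel_size // 2))
            layers.append(nn.ReLU())
            in_dim = hidden_dim
        self.convs = nn.Sequential(*layers)

    def forward(self, token_ids, syntax_ids):
        # token_ids, syntax_ids: (batch, seq_len) index tensors
        x = torch.cat([self.word_emb(token_ids),
                       self.syntax_emb(syntax_ids)], dim=-1)  # (B, T, Dw+Ds)
        x = x.transpose(1, 2)                                 # (B, D, T) for Conv1d
        return self.convs(x).transpose(1, 2)                  # (B, T, H)

# Usage example with toy shapes.
enc = SyntaxAwareEncoder(vocab_size=1000, syntax_vocab_size=50)
tokens = torch.randint(0, 1000, (2, 7))
syntax = torch.randint(0, 50, (2, 7))
print(enc(tokens, syntax).shape)  # torch.Size([2, 7, 256])
```

In this sketch the syntactic signal enters purely through the concatenated label embedding; varying num_layers and kernel_size corresponds to the depth and kernel-size comparison described above.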