Journal of Software (Ruan Jian Xue Bao), 2021, No. 8
Ju Shenggen et al.: Chinese fine-grained named entity recognition based on associated memory networks
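label information is processed. Therefore, when the entity categories contained in a sentence are predicted poorly, selecting memory sentences by edit distance works better than the second method.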
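To make the edit-distance-based selection concrete, the following is a minimal sketch rather than the authors' implementation. It assumes character-level Levenshtein distance and a simple "k nearest training sentences" rule; the names `edit_distance`, `select_memory_sentences`, and the parameter `k` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two character sequences."""
    # At the start of iteration i, prev[j] is the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def select_memory_sentences(query: str, candidates: List[str],
                            k: int = 2) -> List[Tuple[int, str]]:
    """Return the k candidates closest to `query` as (distance, sentence) pairs."""
    scored = [(edit_distance(query, c), c) for c in candidates]
    # sorted() is stable, so ties keep their original candidate order.
    return sorted(scored, key=lambda pair: pair[0])[:k]
```

Because this selection depends only on the surface form of the sentences, it does not rely on a prior prediction of which entity categories a sentence contains, which is consistent with the observation above.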
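Because the edit-distance-based model has some ability both to exploit correct entity-category information and to cope with incorrect entity-category information, and because the model recognizes "address" entities poorly, we tried replacing the category label information of all sentences with the label embedding of the "address" category. As shown in Table 7, the overall recognition performance of the model improved noticeably, from 79.98% to 80.62% F1.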
Table 7 F1 values of the model with enhanced address information
Model                                                                  Online F1 (%)
RoBERTa-wwm-base-ext + associated memory network (address-enhanced)   80.62
RoBERTa-wwm-base-ext + associated memory network                      79.98
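The address-enhancement experiment can be pictured with the small sketch below. It is only an illustration under assumptions: the three-dimensional vectors, the label names other than "address", and the helper `embed_labels` with its `override` parameter are all hypothetical, and the excerpt does not state at which granularity the authors apply the replacement.

```python
from typing import Dict, List

# Hypothetical 3-dimensional label embeddings; a real model learns these vectors.
LABEL_EMBEDDINGS: Dict[str, List[float]] = {
    "address": [0.9, 0.1, 0.0],
    "name": [0.0, 0.8, 0.2],
    "O": [0.0, 0.0, 0.0],
}


def embed_labels(labels: List[str], override: str = "") -> List[List[float]]:
    """Map a label sequence to embeddings.

    If `override` is non-empty, every position uses that label's embedding,
    which mimics replacing all category label information with one category.
    """
    if override:
        labels = [override] * len(labels)
    # Copy each vector so callers cannot mutate the shared table.
    return [list(LABEL_EMBEDDINGS[label]) for label in labels]
```

Calling `embed_labels(labels, override="address")` keeps the sequence length unchanged and swaps only the label vectors, so the rest of the model can consume the result exactly as before.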
4 Conclusion
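This paper makes full use of a pre-trained language model to capture the contextual information of the characters in a sentence. It also uses an associated memory network to bring the contextual information of each character close to the label information of the entity categories, and to fuse the category label information into the character representations of the sequence. Finally, a multi-head self-attention network efficiently computes the attention between any two positions in the sentence and re-encodes the label-fused character representations, which improves entity recognition. Experimental results show that the proposed model achieves better results on the fine-grained named entity recognition task. In future work, we hope to design a multi-label text classification model for fine-grained named entity recognition that better predicts the entity categories contained in a sentence, and to combine it with the entity-category distance computation proposed in this paper to further improve recognition performance.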
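As an illustration of the re-encoding step mentioned above, the following is a minimal pure-Python sketch of multi-head self-attention. It is not the authors' network: it assumes identity projections (queries, keys, and values are the input itself) in place of the learned projection matrices of a real model, and all function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x: Matrix) -> Matrix:
    """Scaled dot-product self-attention with queries = keys = values = x."""
    scale = math.sqrt(len(x[0]))
    out = []
    for q in x:
        # Attention weights of this position over every position in the sentence.
        weights = softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x])
        # Weighted sum of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, x)) for d in range(len(q))])
    return out


def multi_head_self_attention(x: Matrix, num_heads: int) -> Matrix:
    """Split the feature dimension into heads, attend per head, and concatenate."""
    dim = len(x[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = [attention_head([row[h * head_dim:(h + 1) * head_dim] for row in x])
             for h in range(num_heads)]
    # Concatenate the per-head outputs position by position.
    return [[value for head in heads for value in head[pos]] for pos in range(len(x))]
```

Each row of `x` stands for one label-fused character representation. Every output position is a weighted mix of all positions, which is how attention relates any two positions of the sentence in a single step.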