Journal of Software (软件学报), 2021, Issue 9
Journal of Software (Ruan Jian Xue Bao), ISSN 1000-9825, CODEN RUXUEW. E-mail: jos@iscas.ac.cn
Journal of Software,2021,32(9):2783−2800 [doi: 10.13328/j.cnki.jos.005992] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
Sentiment Classification Method Based on Multi-channel Features and Self-attention ∗
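LI Wei-Jiang, QI Fang, YU Zheng-Tao
(Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China)
Corresponding author: LI Wei-Jiang, E-mail: hrbrichard@126.com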
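Abstract: Sentiment analysis methods do not make full use of existing linguistic knowledge and sentiment resources, and
sequence models have a problem of their own: the model encodes the input text sequence into a vector of one fixed length,
and if that length is set too short, information from the input text is lost. To address these problems, a bidirectional
LSTM sentiment classification method based on multi-channel features and self-attention (MFSA-BiLSTM) is proposed. The
model represents the existing linguistic knowledge and sentiment resources of the sentiment analysis task as separate
feature channels and uses self-attention to focus on and reinforce this sentiment information. MFSA-BiLSTM can fully mine
the relationship between sentiment target words and sentiment polarity words in a sentence, and it does not depend on
manually compiled sentiment lexicons. In addition, building on MFSA-BiLSTM, the MFSA-BiLSTM-D model is proposed for
document-level text classification. It first learns representations of all sentences in a document and then derives the
representation of the whole document. Finally, experiments were carried out on 5 baseline datasets. The results show that,
in most cases, MFSA-BiLSTM and MFSA-BiLSTM-D achieve higher classification accuracy than other state-of-the-art text
classification methods.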
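Key words: sentiment classification; multi-channel features; self-attention; deep learning; bidirectional LSTM
Chinese Library Classification (CLC) number: TP391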
Citation format: Li WJ, Qi F, Yu ZT. Sentiment classification method based on multi-channel features and self-attention.
Ruan Jian Xue Bao/Journal of Software, 2021,32(9):2783−2800 (in Chinese). http://www.jos.org.cn/1000-9825/5992.htm
Sentiment Classification Method Based on Multi-channel Features and Self-attention
LI Wei-Jiang, QI Fang, YU Zheng-Tao
(Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Abstract: Sentiment analysis tasks do not fully exploit existing linguistic knowledge and sentiment resources. Sequence
models raise a further problem: the model encodes the input text sequence into a vector of one fixed length, and if that
length is set too short, information from the input text is lost. To address both problems, this study proposes a
bidirectional LSTM sentiment classification method based on multi-channel features and self-attention (MFSA-BiLSTM). The
method models the existing linguistic knowledge and sentiment resources of sentiment analysis tasks as separate feature
channels and uses a self-attention mechanism to focus on sentiment information. MFSA-BiLSTM can fully explore the
relationship between sentiment target words and sentiment polarity words in a sentence, and it does not rely on a manually
compiled sentiment lexicon. In addition, this study proposes the MFSA-BiLSTM-D model, built on MFSA-BiLSTM, for
document-level text classification. That model first learns representations of all sentences in a document and then
derives the representation of the entire document. Finally, experiments are conducted on five sentiment classification
datasets. The results show that, in most cases, MFSA-BiLSTM and MFSA-BiLSTM-D outperform other state-of-the-art text
classification methods in classification accuracy.
Key words: sentiment classification; multi-channel features; self-attention; deep learning; bidirectional LSTM
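The abstract contrasts squeezing a sequence into one fixed-length vector with letting self-attention weigh every position, and it describes the document-level model as building sentence representations first and a document representation second. The sketch below illustrates only that general idea and is not the paper's formulation. It assumes plain scaled dot-product self-attention with no learned projections, mean pooling for both the sentence and the document vector, and made-up toy vectors standing in for BiLSTM hidden states. All function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(states):
    """Row i holds how strongly position i attends to every position.

    Assumption: queries, keys, and values are the raw states themselves
    (no learned projection matrices), scored by scaled dot product.
    """
    d = len(states[0])
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states])
        for q in states
    ]

def self_attention(states):
    """Replace each state with an attention-weighted mix of all states."""
    weights = attention_weights(states)
    d = len(states[0])
    return [
        [sum(w * v[t] for w, v in zip(row, states)) for t in range(d)]
        for row in weights
    ]

def mean_pool(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(v[t] for v in vectors) / n for t in range(len(vectors[0]))]

def sentence_vector(states):
    """Sentence representation: attend over all positions, then pool (assumed)."""
    return mean_pool(self_attention(states))

def document_vector(sentences):
    """Document representation built from per-sentence vectors (assumed pooling)."""
    return mean_pool([sentence_vector(s) for s in sentences])

# Toy "hidden states" for a two-sentence document (invented numbers).
doc = [
    [[0.1, 0.0], [2.0, 1.5], [0.0, 0.3]],
    [[1.0, -1.0], [0.5, 0.5]],
]
doc_vec = document_vector(doc)
```

Because every attended state is a convex combination of the input states, no position is discarded outright. How much each one contributes is decided by the attention weights instead of by a single final hidden state.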
∗ Foundation item: National Natural Science Foundation of China (62066022); National Key Research and Development Program of
China (2018YFC0830105)
Received 2019-06-24; Revised 2019-10-31; Accepted 2019-12-11