
Ruan Jian Xue Bao (Journal of Software), ISSN 1000-9825, CODEN RUXUEW    E-mail: jos@iscas.ac.cn
         Journal of Software, 2021,32(9):2783−2800 [doi: 10.13328/j.cnki.jos.005992]    http://www.jos.org.cn
         © Institute of Software, Chinese Academy of Sciences. All rights reserved.    Tel: +86-10-62562563


         Sentiment Classification Method Based on Multi-channel Features and Self-attention ∗

         LI Wei-Jiang, QI Fang, YU Zheng-Tao


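         (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China)
         Corresponding author: LI Wei-Jiang, E-mail: hrbrichard@126.com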

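         Abstract:  Sentiment analysis models often fail to make full use of existing linguistic knowledge and sentiment resources, and
         sequence models have a further limitation: they map the input text sequence to a vector of fixed length, so if that length is
         set too small, information from the input text is lost. To address both problems, this paper proposes MFSA-BiLSTM, a
         bidirectional LSTM sentiment classification method based on multi-channel features and self-attention. The model encodes the
         existing linguistic knowledge and sentiment resources as separate feature channels and uses self-attention to emphasize the
         sentiment information they carry. MFSA-BiLSTM can fully exploit the relationship between sentiment target words and
         sentiment polarity words in a sentence, and it does not depend on a manually compiled sentiment lexicon. Building on
         MFSA-BiLSTM, the paper also proposes MFSA-BiLSTM-D for document-level text classification; this model first learns
         representations of all sentences in a document and then derives the representation of the whole document. Experiments on
         five benchmark datasets show that, in most cases, MFSA-BiLSTM and MFSA-BiLSTM-D achieve higher classification accuracy
         than other state-of-the-art text classification methods.
         Keywords:  sentiment classification; multi-channel features; self-attention; deep learning; bidirectional LSTM
         CLC number: TP391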

         Citation format (Chinese):  李卫疆,漆芳,余正涛.基于多通道特征和自注意力的情感分类方法.软件学报,2021,32(9):2783−2800.
         http://www.jos.org.cn/1000-9825/5992.htm
         Citation format (English): Li WJ, Qi F, Yu ZT. Sentiment classification method based on multi-channel features and self-attention.
         Ruan Jian Xue Bao/Journal of Software, 2021,32(9):2783−2800 (in Chinese). http://www.jos.org.cn/1000-9825/5992.htm

         Sentiment Classification Method Based on Multi-channel Features and Self-attention

         LI Wei-Jiang, QI Fang, YU Zheng-Tao
         (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
         Abstract:    This study addresses two problems in sentiment analysis: existing linguistic knowledge and sentiment resources are not
         fully utilized, and sequence models map the input text sequence to a fixed-length vector, so information from the input text is
         lost when that length is set too short. A bidirectional LSTM sentiment classification method based on multi-channel features and
         self-attention (MFSA-BiLSTM) is proposed. The method models the existing linguistic knowledge and sentiment resources of
         sentiment analysis tasks as different feature channels, and uses a self-attention mechanism to focus on and strengthen this
         sentiment information. The MFSA-BiLSTM model can fully explore the relationship between sentiment target words and sentiment
         polarity words in a sentence, and does not rely on a manually compiled sentiment lexicon. In addition, this study proposes the
         MFSA-BiLSTM-D model, based on MFSA-BiLSTM, for document-level text classification tasks. That model first learns the
         representations of all sentences in a document and then obtains the representation of the entire document. Finally, experiments
         are conducted on five sentiment classification datasets. The results show that, in most cases, MFSA-BiLSTM and MFSA-BiLSTM-D
         are superior to other state-of-the-art text classification methods in terms of classification accuracy.
         Key words:    sentiment classification; multi-channel features; self-attention; deep learning; bidirectional LSTM
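
         The abstract argues that summarizing a sentence in one fixed-length vector loses information, and that attention over the
         BiLSTM states helps the model focus on sentiment-bearing words. The sketch below illustrates that general idea with plain
         attention pooling. It is a simplified stand-in, not the formulation used in MFSA-BiLSTM: the function names, the toy hidden
         states, and the `query` vector are all hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Score every time step by its dot product with `query`, normalize the
    scores with softmax, and return (weights, weighted sum of the states)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled

# Toy per-token hidden states for a 4-token sentence.
# Assumption: these values are made up; in a real model they would come
# from a BiLSTM run over the (multi-channel) token features.
hidden_states = [
    [0.1, 0.0],
    [0.9, 0.8],  # imagine this token is a sentiment polarity word
    [0.2, 0.1],
    [0.0, 0.1],
]
query = [2.0, 2.0]  # hypothetical attention vector (learned in a real model)

weights, pooled = attention_pool(hidden_states, query)
print("attention weights:", [round(w, 3) for w in weights])
print("sentence vector:  ", [round(v, 3) for v in pooled])
```

         In this toy run the weights sum to 1 and the second token receives the largest weight, so the pooled sentence vector is
         dominated by that token instead of depending only on the final hidden state.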


            ∗  Foundation item: National Natural Science Foundation of China (62066022); National Key Research and Development Program of
         China (2018YFC0830105)
              Received 2019-06-24; Revised 2019-10-31; Accepted 2019-12-11