Ruan Jian Xue Bao (Journal of Software) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software, 2021,32(10):3139-3150 [doi: 10.13328/j.cnki.jos.006028] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
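Few Shot and Confusing Charges Prediction with the Auxiliary Sentences of Case
GUO Jun-Jun (1,2), LIU Zhen-Cheng (1,2), YU Zheng-Tao (1,2), HUANG Yu-Xin (1,2), XIANG Yan (1,2)
1 (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500)
2 (Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming, Yunnan 650500)
Corresponding author: YU Zheng-Tao, E-mail: ztyu@hotmail.com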
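Abstract: Prediction performance on low-frequency and confusing charges is poor, because low-frequency charges have few training samples and confusing charges have similar case descriptions. To address these problems, this paper constructs case auxiliary sentences and proposes a method that fuses them with the case description through a bi-directional mutual attention mechanism to predict charges. The method has three parts. First, case auxiliary sentences are constructed from judicial domain knowledge and serve as mapping knowledge between case descriptions and charges. Second, multi-granularity features of the case description and of the case auxiliary sentences are extracted from word-level and character-level representations. Third, a bi-directional attention mechanism between the case auxiliary sentences and the case description yields a case description representation biased toward the auxiliary sentences, which is finally used to predict low-frequency and confusing charges. Experiments on a public dataset of Chinese criminal cases show that the proposed method improves F1 by up to 13.2% and accuracy by up to 4.5%; F1 on low-frequency charges improves by 4.3%, and F1 on confusing charges improves by 8.2%. The proposed algorithm thus significantly improves prediction performance on low-frequency and confusing charges.
Keywords: low-frequency charges; confusing charges; bi-directional mutual attention; multi-granularity encoding; case auxiliary sentences
CLC number: TP18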
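The bi-directional attention between a case description and a case auxiliary sentence, as summarized in the abstract, might be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the dot-product similarity, the max-pooling in the auxiliary-to-description direction, and the concatenation used for fusion are all assumptions borrowed from common bi-directional attention designs, and the function names (`bidirectional_attention`, `weighted_sum`) and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a single vector."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]

def bidirectional_attention(desc, aux):
    """desc: token vectors of the case description.
    aux: token vectors of the case auxiliary sentence.
    Both are assumed to come from some upstream encoder (not shown)."""
    # Assumed similarity: plain dot product between every token pair.
    sim = [[dot(h, u) for u in aux] for h in desc]
    # Description -> auxiliary: each description token attends over aux tokens.
    desc_to_aux = [weighted_sum(softmax(row), aux) for row in sim]
    # Auxiliary -> description: weight each description token by its best
    # match against the auxiliary sentence, then pool into one vector.
    token_weights = softmax([max(row) for row in sim])
    aux_to_desc = weighted_sum(token_weights, desc)
    # Assumed fusion: concatenate the original and attended features per token.
    fused = []
    for h, a in zip(desc, desc_to_aux):
        fused.append(
            h + a
            + [x * y for x, y in zip(h, a)]
            + [x * y for x, y in zip(h, aux_to_desc)]
        )
    return fused, token_weights

# Toy 2-dimensional vectors, purely for illustration.
desc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aux = [[1.0, 0.0], [0.5, 0.5]]
fused, token_weights = bidirectional_attention(desc, aux)
print(len(fused), len(fused[0]))  # 3 8
```

Each description token ends up with a representation four times its original width, in which tokens that match the auxiliary sentence closely receive larger pooling weights. A classifier over charges would then consume `fused`; that step is omitted here.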
Citation format: Guo JJ, Liu ZC, Yu ZT, Huang YX, Xiang Y. Few shot and confusing charges prediction with the auxiliary sentences of case. Ruan Jian Xue Bao/Journal of Software, 2021,32(10):3139-3150 (in Chinese). http://www.jos.org.cn/1000-9825/6028.htm
Few Shot and Confusing Charges Prediction with the Auxiliary Sentences of Case
GUO Jun-Jun (1,2), LIU Zhen-Cheng (1,2), YU Zheng-Tao (1,2), HUANG Yu-Xin (1,2), XIANG Yan (1,2)
1 (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
2 (Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming 650500, China)
Abstract: Because few-shot charges have little training data and confusing charges have similar case descriptions, existing methods perform poorly on both.
To address these drawbacks, a novel prediction method for few-shot and confusing charges is proposed, based on a bi-directional mutual attention
mechanism with the auxiliary sentences of case. In the proposed model, firstly, the auxiliary sentence of case is constructed from judicial domain
knowledge and is treated as external knowledge for mapping the description of the case to the
corresponding charge. Secondly, the multi-granularity features of the case description and the auxiliary sentence of case are extracted at
both the word level and the character level. At the same time, the auxiliary sentence of case and case description are used to build
Foundation item: National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101,
2018YFC0830100); National Natural Science Foundation of China (61972186, 61762056, 61472168, 61866020); Provincial Personnel
Training Project of Yunnan Science and Technology Department (KKSY201703015); Natural Science Foundation Project of Yunnan
Science and Technology Department (2019FB082, 202001AT070047)
Received: 2019-12-06; Revised: 2020-02-09; Accepted: 2020-03-02