
Journal of Software (Ruan Jian Xue Bao), ISSN 1000-9825, CODEN RUXUEW                E-mail: jos@iscas.ac.cn
                 Journal of Software, 2020,31(11):3436−3447 [doi: 10.13328/j.cnki.jos.005863]   http://www.jos.org.cn
                 © Institute of Software, Chinese Academy of Sciences. All rights reserved.       Tel: +86-10-62562563


                 Malware Detection Method Based on Subgraph Similarity∗

                 WANG Jie, WANG Chang-Qing


                 (School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China)
                 Corresponding author: WANG Jie, E-mail: jwang@csu.edu.cn

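                 Abstract:  Dynamic behavior analysis is a common approach to malware analysis. Graphs are typically used to
                 represent a malicious program's system calls or resource dependencies; graph mining algorithms then find the
                 malicious feature subgraphs shared by known malware samples, and these feature subgraphs are used to detect
                 malware. However, such methods usually depend on graph matching, which is inevitably slow. They also ignore
                 the relationships between subgraphs, even though taking those relationships into account helps improve
                 detection performance. To address these two problems, a malware detection method based on subgraph
                 similarity, called DMBSS, is proposed. The method uses a data flow graph to represent the system behaviors or
                 events of a malicious program at run time, extracts malicious-behavior feature subgraphs from the data flow
                 graph, and uses the “inverse topology identification” algorithm to represent each feature subgraph as a
                 string. Because the string carries the structural information of the subgraph, string comparison replaces
                 graph matching. A neural network then computes the similarity between subgraphs, that is, it represents each
                 subgraph structure as a high-dimensional vector so that similar subgraphs lie close together in the vector
                 space. Finally, the subgraph vectors are used to build a similarity function between malicious programs,
                 which is combined with an SVM classifier to detect malware. Experimental results show that, compared with
                 other methods, DMBSS detects malware faster and with higher accuracy.
                 Key words:  malware detection; neural network; distributed representation of subgraphs; graph similarity function
                 CLC number: TP311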

                 Chinese citation format: 汪洁,王长青.子图相似性的恶意程序检测方法.软件学报,2020,31(11):3436−3447. http://www.jos.org.cn/1000-9825/5863.htm
                 English citation format: Wang J, Wang CQ. Malware detection method based on subgraph similarity. Ruan Jian Xue Bao/Journal of
                 Software, 2020,31(11):3436−3447 (in Chinese). http://www.jos.org.cn/1000-9825/5863.htm

                 Malware Detection Method Based on Subgraph Similarity
                 WANG Jie,  WANG Chang-Qing

                 (School of Computer Science and Engineering, Central South University, Changsha 410083, China)
                 Abstract:    Dynamic behavior analysis is a common approach to malware detection. It represents a malicious program’s system calls
                 or resource dependencies as graphs, uses graph mining algorithms to find the malicious feature subgraphs shared by known malware
                 samples, and detects unknown programs with these features. However, such methods usually rely on graph matching, which is
                 inevitably slow, and they neglect the relationships between subgraphs, although considering those relationships can improve
                 detection accuracy. To solve these two problems, a malware detection method based on subgraph similarity, called DMBSS, is
                 proposed. It uses a data flow graph to represent the system behaviors or events of a running malicious program, extracts
                 malicious-behavior feature subgraphs from the data flow graph, and uses an “inverse topology identification” algorithm to
                 represent each feature subgraph as a string. The string carries the structural information of the subgraph, so string comparison
                 replaces graph matching. A neural network is then used to compute the similarity between subgraphs, that is, to represent each
                 subgraph structure as a high-dimensional vector so that similar subgraphs lie close together in the vector space. Finally, the
                 subgraph vectors are used to construct a similarity function for malicious programs, which is combined with an SVM classifier to
                 detect malware. The experimental results show that, compared with other methods, DMBSS detects malicious programs faster and with
                 higher accuracy.
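
                 The two central ideas of the abstract, replacing graph matching with string comparison and comparing programs through
                 subgraph vectors, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does
                 not specify how “inverse topology identification” works or how the similarity function is defined, so the
                 sinks-first string encoding and the averaged best-match cosine score below are assumptions chosen for illustration,
                 and the names `canonical_string`, `cosine`, and `program_similarity` are hypothetical.

```python
import math
from collections import defaultdict


def canonical_string(labels, edges):
    """Encode a small labelled, acyclic subgraph as a string.

    Assumption (not from the paper): a node's code is its label followed by
    the sorted codes of its successors, so sinks are encoded first and every
    code embeds the structure below it. Isomorphic labelled DAGs yield the
    same string, which lets string equality stand in for graph matching.
    Cyclic inputs are not supported by this sketch.
    """
    children = defaultdict(list)
    has_parent = set()
    for src, dst in edges:
        children[src].append(dst)
        has_parent.add(dst)

    memo = {}

    def code(node):
        if node not in memo:
            inner = ",".join(sorted(code(c) for c in children[node]))
            memo[node] = f"{labels[node]}({inner})"
        return memo[node]

    roots = [n for n in labels if n not in has_parent]
    return "|".join(sorted(code(r) for r in roots))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def program_similarity(vecs_a, vecs_b):
    """Symmetric similarity between two programs, each given as a list of
    subgraph vectors: the mean best-match cosine score in both directions.
    This particular formula is an illustrative assumption."""
    if not vecs_a or not vecs_b:
        return 0.0

    def one_way(xs, ys):
        return sum(max(cosine(x, y) for y in ys) for x in xs) / len(xs)

    return 0.5 * (one_way(vecs_a, vecs_b) + one_way(vecs_b, vecs_a))


if __name__ == "__main__":
    # Two subgraphs with different node ids but the same labelled structure.
    g1 = canonical_string({1: "proc", 2: "file", 3: "reg"}, [(1, 2), (1, 3)])
    g2 = canonical_string({"a": "proc", "b": "reg", "c": "file"},
                          [("a", "b"), ("a", "c")])
    print(g1 == g2)
    print(program_similarity([[1.0, 0.0], [0.6, 0.8]], [[1.0, 0.0]]))
```

                 Under these assumptions, a matrix of pairwise `program_similarity` scores over a labelled sample set could be passed to
                 a kernel classifier such as an SVM, which is the role the abstract assigns to the similarity function.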

                   ∗  Foundation item: National Natural Science Foundation of China (61202495)
                      Received 2018-12-10; Revised 2019-01-17, 2019-03-23; Accepted 2019-04-22