Journal of Software (《软件学报》), 2021, No. 10
Journal of Software (软件学报), ISSN 1000-9825, CODEN RUXUEW. E-mail: jos@iscas.ac.cn
Journal of Software, 2021, 32(10): 3254-3265 [doi: 10.13328/j.cnki.jos.006062] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
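Research on Intrusion Detection Based on Random Forest and Gradient Boosting Tree (融合随机森林和梯度提升树的入侵检测研究)
ZHOU Jie-Ying (周杰英), HE Peng-Fei (贺鹏飞), QIU Rong-Fa (邱荣发), CHEN Guo (陈国), WU Wei-Gang (吴维刚)
(School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, Guangdong, China)
Corresponding author: WU Wei-Gang, E-mail: wuweig@mail.sysu.edu.cn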
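Abstract: As a security defense technology that protects networks from attack, network intrusion detection systems play a very important role in safeguarding computer systems and network security. For the multi-class classification problem with imbalanced data in network intrusion detection, machine learning has been widely applied and is more intelligent and more accurate than traditional methods. This paper improves on existing multi-class methods for network intrusion detection and proposes RF-GBDT, an intrusion detection model that uses a random forest model for feature transformation and a gradient boosting decision tree model for classification. The model consists of three main parts: feature selection, feature transformation, and the classifier. RF-GBDT was tested experimentally on the UNSW-NB15 dataset. Compared with three other algorithms in the same field, RF-GBDT both shortens training time and achieves a higher detection rate and a lower false alarm rate; its area under the receiver operating characteristic curve on the test dataset reaches 98.57%. RF-GBDT has a fairly significant advantage in solving the imbalanced multi-class problem in network intrusion detection and is a practical, feasible intrusion detection method.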
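Keywords: network intrusion detection; data imbalance; random forest; gradient boosting tree; UNSW-NB15 dataset
Chinese Library Classification number: TP309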
Citation format (Chinese): 周杰英,贺鹏飞,邱荣发,陈国,吴维刚.融合随机森林和梯度提升树的入侵检测研究.软件学报,2021,32(10):3254-3265. http://www.jos.org.cn/1000-9825/6062.htm
Citation format (English): Zhou JY, He PF, Qiu RF, Chen G, Wu WG. Research on intrusion detection based on random forest and gradient boosting tree. Ruan Jian Xue Bao/Journal of Software, 2021,32(10):3254-3265 (in Chinese). http://www.jos.org.cn/1000-9825/6062.htm
Research on Intrusion Detection Based on Random Forest and Gradient Boosting Tree
ZHOU Jie-Ying, HE Peng-Fei, QIU Rong-Fa, CHEN Guo, WU Wei-Gang
(School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China)
Abstract: As a security defense technology that protects the network from attacks, the network intrusion detection system plays a crucial role in guaranteeing computer system and network security. For the multi-classification problem of unbalanced data in network intrusion detection, machine learning has been widely used to achieve higher intelligence and accuracy than traditional methods. In this paper, the current multi-classification method for network intrusion detection is improved, and an intrusion detection model, RF-GBDT, is proposed, which applies the random forest model to feature conversion and the gradient boosting decision tree model to classification. The model is divided into three parts: feature selection, feature conversion, and classifier. Experimental tests were carried out on the RF-GBDT model using the UNSW-NB15 dataset. Compared with three other algorithms in the same field, RF-GBDT not only reduces training time but also has a higher detection rate and a lower false alarm rate. The area under the receiver operating characteristic curve on the test dataset can reach 98.57%. RF-GBDT has significant advantages in solving the multi-classification problem of unbalanced data in network intrusion detection and is a feasible method for network intrusion detection.
Foundation item: National Key Research and Development Project of China (2018YFB0203803); National Natural Science Foundation of China (U1711263, U1801266); Natural Science Foundation of Guangdong Province of China (2018A030313492, 2018B030312002)
Received: 2019-09-12; Revised: 2020-02-01; Accepted: 2020-04-13
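The three-part structure named in the abstract (feature selection, random-forest feature transformation, GBDT classification) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the abstract does not say how features are selected or transformed, so the sketch assumes importance-based selection and one-hot encoding of random-forest leaf indices, uses synthetic imbalanced data in place of UNSW-NB15, and picks all hyperparameters arbitrarily. The function name `rf_gbdt_sketch` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder


def rf_gbdt_sketch(seed=0):
    """Hypothetical three-stage pipeline; returns (auc, n_transformed_features, proba)."""
    # Synthetic, imbalanced 3-class data standing in for UNSW-NB15 (assumption).
    X, y = make_classification(
        n_samples=1500, n_features=20, n_informative=8, n_classes=3,
        weights=[0.8, 0.15, 0.05], random_state=seed,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Part 1: feature selection. Assumed here: keep features whose
    # random-forest importance is at least the mean importance.
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=seed)
    ).fit(X_tr, y_tr)
    X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

    # Part 2: feature transformation. Assumed here: each sample is mapped to
    # the leaf it reaches in every tree, and the leaf indices are one-hot encoded.
    rf = RandomForestClassifier(
        n_estimators=20, max_depth=4, random_state=seed
    ).fit(X_tr_sel, y_tr)
    encoder = OneHotEncoder(handle_unknown="ignore").fit(rf.apply(X_tr_sel))
    Z_tr = encoder.transform(rf.apply(X_tr_sel)).toarray()
    Z_te = encoder.transform(rf.apply(X_te_sel)).toarray()

    # Part 3: classifier. A GBDT is trained on the transformed features.
    gbdt = GradientBoostingClassifier(n_estimators=50, random_state=seed)
    gbdt.fit(Z_tr, y_tr)
    proba = gbdt.predict_proba(Z_te)

    # Multi-class area under the ROC curve, averaged one-vs-rest.
    auc = roc_auc_score(y_te, proba, multi_class="ovr")
    return auc, Z_tr.shape[1], proba


if __name__ == "__main__":
    auc, n_features, _ = rf_gbdt_sketch()
    print(f"one-vs-rest AUC: {auc:.4f}, transformed features: {n_features}")
```

The AUC printed by this sketch depends on the synthetic data and says nothing about the 98.57% reported for UNSW-NB15. The stratified split is used so that the rarest class appears in both the training and test sets, which matters when classes are as imbalanced as the abstract describes.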