Page 291 - Journal of Software (《软件学报》), 2021, Issue 10


Zhou Jieying et al.: Research on intrusion detection fusing random forest and gradient boosting tree 3263

Fig. 6 Receiver operating characteristic (ROC) curve and precision-recall (PR) curve

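The experiments trained the models on the UNSW-NB15 dataset and compared four algorithms (RF-GBDT, Logistic Regression, AdaBoost, and K-NN) using ten-fold cross-validation on the training set.
The results show that the RF-GBDT model achieves a higher detection rate and a lower false alarm rate than the compared algorithms.
Specifically, RF-GBDT reaches a detection rate of 83.78%, a false alarm rate of 1.8%, an F1 score of 83.78%, a ROC AUC of 98.57%, and a PR AUC of 91.48%.
In addition, RF-GBDT keeps a high detection rate on classes with very few samples: the classes "Worms", "Reconnaissance", "Shellcode", and "Generic" all have few samples, yet each is detected at a rate above 84%.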
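As a reading aid for the metrics quoted above, the following is a minimal sketch of how a per-class detection rate and a false alarm rate could be computed from true and predicted labels. It assumes the common definitions (detection rate as per-class recall, false alarm rate as the share of normal traffic flagged as an attack); this page does not state the exact formulas used in the experiments. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def per_class_detection_rate(y_true, y_pred):
    """Recall per class: correctly detected samples / all samples of that class."""
    total = Counter(y_true)
    hit = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hit[c] / total[c] for c in total}


def false_alarm_rate(y_true, y_pred, normal="Normal"):
    """Share of normal samples that were predicted as any attack class."""
    normal_preds = [p for t, p in zip(y_true, y_pred) if t == normal]
    if not normal_preds:
        return 0.0
    return sum(p != normal for p in normal_preds) / len(normal_preds)


# Toy labels, made up for illustration (not taken from UNSW-NB15).
y_true = ["Normal"] * 5 + ["Worms"] * 2 + ["Generic"] * 3
y_pred = ["Normal", "Normal", "Normal", "Normal", "Generic",
          "Worms", "Normal",
          "Generic", "Generic", "Generic"]

rates = per_class_detection_rate(y_true, y_pred)
far = false_alarm_rate(y_true, y_pred)
print(rates, far)
```

Reporting the rate per class, rather than only a single aggregate, is what makes it possible to see whether rare classes such as "Worms" are detected as reliably as frequent ones.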

7 Conclusion

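This paper addresses the multi-class classification problem of imbalanced network intrusion detection data and proposes RF-GBDT, an intrusion detection classification framework that combines random forest feature transformation with gradient boosting decision trees. The framework has three main parts: feature selection, feature transformation, and a classifier.
Feature selection uses the GBDT feature importances to discard irrelevant features, which not only reduces computation and speeds up training but also improves the detection rate. Feature transformation trains a RandomForest on the data and takes, for every tree, the index of the leaf that each sample lands in as a new feature. Classification uses GBDT, with the number of trees and the learning rate tuned to select the best model parameters.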
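The feature transformation step can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the two hand-written trees below stand in for a trained random forest, the tuple layout and function names are hypothetical, and the one-hot encoding of leaf indices is an assumption (the text only says that leaf indices become the new features).

```python
# A toy tree is a nested tuple (feature_index, threshold, left, right);
# a leaf is an integer id in 0..n_leaves-1. These are hand-written
# stand-ins for trained trees, used only to show the transformation.
def leaf_index(tree, x):
    """Walk one tree and return the id of the leaf that sample x lands in."""
    node = tree
    while not isinstance(node, int):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def count_leaves(tree):
    """Number of leaves in a toy tree."""
    if isinstance(tree, int):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


def leaf_features(forest, x):
    """One-hot encode the leaf reached in every tree and concatenate.

    Assumption: one-hot encoding; raw leaf indices could be used instead.
    """
    encoded = []
    for tree in forest:
        one_hot = [0] * count_leaves(tree)
        one_hot[leaf_index(tree, x)] = 1
        encoded.extend(one_hot)
    return encoded


forest = [
    (0, 0.5, 0, (1, 2.0, 1, 2)),  # tree with 3 leaves (ids 0..2)
    (1, 1.0, 0, 1),               # tree with 2 leaves (ids 0..1)
]
sample = [0.9, 1.5]

indices = [leaf_index(tree, sample) for tree in forest]
features = leaf_features(forest, sample)
print(indices, features)
```

In a full pipeline, the transformed feature vectors would then be passed to the GBDT classifier. With scikit-learn, `RandomForestClassifier.apply(X)` returns such per-tree leaf indices for a fitted forest, though this page does not confirm which library was used.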
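Experimental results show that, on the UNSW-NB15 dataset, the proposed RF-GBDT model has a relatively high detection rate and a relatively low false alarm rate.
RF-GBDT detects the attack types in network traffic fairly accurately and is especially effective on attack types with few samples. It therefore offers a notable advantage for the multi-class classification of imbalanced network intrusion detection data.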

