Page 304 - Journal of Software (《软件学报》), 2024, Issue 6
Journal of Software (软件学报), ISSN 1000-9825, CODEN RUXUEW, E-mail: jos@iscas.ac.cn
Journal of Software, 2024, 35(6): 2880−2902 [doi: 10.13328/j.cnki.jos.006918] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
LibPass: Third-party Library Detection Method Based on Package Structure and Signature*
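XU Jian (徐建), YUAN Qian-Ting (袁倩婷)
(School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China)
Corresponding author: XU Jian, E-mail: dolphin.xu@njust.edu.cn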
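Abstract: Third-party library (TPL) detection is an upstream task in Android application security analysis, and its accuracy significantly affects downstream tasks such as malicious application detection, repackaging detection, and privacy leakage detection. To improve detection accuracy and efficiency, this paper adopts the idea of similarity comparison and proposes a TPL detection method based on package structure and signatures, named LibPass. LibPass combines three components in a pipeline: primary module identification, TPL candidate identification, and fine-grained detection. Primary module identification separates the binary code of the main program from that of the imported TPLs, with the aim of improving detection efficiency. On this basis, a two-stage detection method consisting of TPL candidate identification and fine-grained detection is proposed. The former exploits the stability of package structure features to cope with application obfuscation, which improves detection accuracy under obfuscation, and uses package structure signatures for fast matching to identify candidate TPLs, which significantly reduces the number of pairwise comparisons and improves detection efficiency. The latter performs a finer-grained but more costly similarity analysis on the candidates screened by the former to precisely identify each TPL and its version. To validate the performance and efficiency of the method, three benchmark datasets that evaluate different detection capabilities are constructed, and experiments are carried out on them. The results are analyzed in depth in terms of detection performance, detection efficiency, and resilience to obfuscation, and they show that LibPass achieves high detection accuracy and efficiency and can cope with a variety of commonly used obfuscation operations.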
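The two-stage detection outlined in the abstract can be illustrated with a minimal Python sketch. It is not the LibPass implementation: the abstract does not define how a package structure signature is computed or which similarity measure the fine-grained stage uses. The signature below (a hash over each package's depth and class count), the Jaccard similarity, the 0.8 threshold, and all names (`structure_signature`, `build_index`, `detect`, `libA`, `libB`) are assumptions made for illustration. The feature strings stand in for class-level features that are assumed to survive identifier renaming.

```python
from collections import defaultdict
import hashlib


def structure_signature(packages):
    """Name-independent signature of a package tree (hypothetical definition).

    `packages` maps a dotted package path to a set of class-level features.
    Only each package's depth and class count are hashed, so renaming
    packages leaves the signature unchanged.
    """
    shape = sorted((path.count("."), len(feats)) for path, feats in packages.items())
    return hashlib.sha256(repr(shape).encode()).hexdigest()


def class_features(packages):
    """Flatten all class-level features of a module into one set."""
    return set().union(*packages.values())


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed fine-grained measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_index(libraries):
    """Stage-1 index: signature -> list of (name, version) keys."""
    index = defaultdict(list)
    for key, packages in libraries.items():
        index[structure_signature(packages)].append(key)
    return index


def detect(module, libraries, index, threshold=0.8):
    """Stage 1: cheap signature lookup. Stage 2: costly set comparison."""
    candidates = index.get(structure_signature(module), [])
    feats = class_features(module)

    def score(key):
        return jaccard(feats, class_features(libraries[key]))

    best = max(candidates, key=score, default=None)
    if best is None or score(best) < threshold:
        return None
    return best


# Toy library database with two versions of one hypothetical library.
libraries = {
    ("libA", "1.0"): {"org.liba": {"f1", "f2", "f3"}, "org.liba.util": {"f4", "f5"}},
    ("libA", "1.1"): {"org.liba": {"f1", "f2", "f6"}, "org.liba.util": {"f4", "f5"}},
    ("libB", "2.0"): {"org.libb": {"g1"}},
}
index = build_index(libraries)

# A module from an app whose package names were renamed by an obfuscator.
app_module = {"a.b": {"f1", "f2", "f6"}, "a.b.c": {"f4", "f5"}}
print(detect(app_module, libraries, index))  # ('libA', '1.1')
```

In this toy setup the signature lookup narrows three library versions down to the two `libA` candidates without any pairwise comparison, and only those two go through the set comparison that picks the version. A depth-and-count signature would not survive package flattening or code shrinking, so it should be read only as a placeholder for a more robust structural feature.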
Key words: third-party library; code obfuscation; security analysis; signature
Chinese Library Classification (CLC) number: TP311
Citation format (Chinese): 徐建, 袁倩婷. LibPass: 基于包结构和签名的第三方库检测方法. 软件学报, 2024, 35(6): 2880–2902.
http://www.jos.org.cn/1000-9825/6918.htm
Citation format (English): Xu J, Yuan QT. LibPass: Third-party Library Detection Method Based on Package Structure and Signature. Ruan Jian
Xue Bao/Journal of Software, 2024, 35(6): 2880–2902 (in Chinese). http://www.jos.org.cn/1000-9825/6918.htm
LibPass: Third-party Library Detection Method Based on Package Structure and Signature
XU Jian, YUAN Qian-Ting
(School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
Abstract: Third-party library (TPL) detection is an upstream task in the domain of Android application security analysis, and its detection
accuracy has a significant impact on its downstream tasks including malware detection, repackaged application detection, and privacy
leakage detection. To improve detection accuracy and efficiency, this study proposes a package structure and signature-based TPL detection
method, named LibPass, by leveraging the idea of pairwise comparison. LibPass combines primary module identification, TPL candidate
identification, and fine-grained detection in a pipeline. Primary module identification aims to improve detection efficiency by
distinguishing the binary code of the main program from that of the imported TPLs. On this basis, a two-stage detection method
consisting of TPL candidate identification and fine-grained detection is proposed. The TPL candidate identification leverages the stability
of package structure features to deal with obfuscation of applications to improve detection accuracy and identifies candidate TPLs by
rapidly comparing package structure signatures to reduce the number of pairwise comparisons, so as to improve the detection efficiency.
The fine-grained detection accurately identifies the TPL of a specific version by a finer-grained but more costly pairwise comparison
among candidate TPLs. In order to validate the performance and the efficiency of the detection method, three benchmark datasets are built
* Funding: National Natural Science Foundation of China (61872186, 61802205)
Received: 2021-02-25; Revised: 2021-06-09, 2022-07-08, 2022-08-21, 2023-01-13; Accepted: 2023-02-08; JOS online publication: 2023-07-26
CNKI online-first publication: 2023-07-27