Page 438 - 《软件学报》2025年第9期
P. 438
软件学报 ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
2025,36(9):4349−4372 [doi: 10.13328/j.cnki.jos.007256] [CSTR: 32375.14.jos.007256] http://www.jos.org.cn
©中国科学院软件研究所版权所有. Tel: +86-10-62562563
*
高精度滑动窗口模型下的图流三角形近似计数算法
苟向阳 1,2 , 邹 磊 2 , 于 旭 1
1
(香港中文大学 系统工程与工程管理系, 香港 999077)
2
(北京大学 王选计算机研究所, 北京 100871)
通信作者: 邹磊, Email: zoulei@pku.edu.cn
摘 要: 近年来, 图流分析在研究领域和工业领域都变得愈发重要. 图流是从数据源持续高速到达的边序列, 这些
边组成了一个不断变化的动态图. 在图流上可以进行多种不同的分析, 而三角形计数是其中最基础的操作之一. 由
于图流数据规模大, 更新速度高, 在图流上进行精确三角形计数效率较低, 而且并不必要. 因为大部分三角形计数
应用都允许一定的误差, 所以, 图流上的近似三角形计数一直都是研究热点之一. 研究基于采样的滑动窗口模型下
的图流近似三角形计数. 滑动窗口模型只关注最近到达的图流数据, 较早的图流数据被认定为过期. 它被广泛应用
于不同的工业场景和研究工作中. 将一种“采样前计数”的方法与该问题场景下最新的算法结合, 并提出一套策略
以应对由于边过期产生的困难. 使用真实数据集展开广泛的实验以测试提出的 CBS 算法. 实验结果表明, CBS 相
比目前最好的工作, 估算误差降低了 70% 以上.
关键词: 图流; 三角形计数; 近似算法
中图法分类号: TP391
中文引用格式: 苟向阳, 邹磊, 于旭. 高精度滑动窗口模型下的图流三角形近似计数算法. 软件学报, 2025, 36(9): 4349–4372. http://
www.jos.org.cn/1000-9825/7256.htm
英文引用格式: Gou XY, Zou L, Yu X. Approximate Graph Stream Triangle Counting Algorithm Based on Sliding Window Model
with High Accuracy. Ruan Jian Xue Bao/Journal of Software, 2025, 36(9): 4349–4372 (in Chinese). http://www.jos.org.cn/1000-9825/
7256.htm
Approximate Graph Stream Triangle Counting Algorithm Based on Sliding Window Model with
High Accuracy
1,2
2
GOU Xiang-Yang , ZOU Lei , YU Xu 1
1
(Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong 999077, China)
2
(Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, China)
Abstract: In recent years, graph stream analysis has gained increasing importance in both research and industry. A graph stream is a
continuous sequence of edges received from a data source at a high speed. Those edges form a dynamic graph that is continuously
changing. Various analyses can be performed on graph streams. Among them, triangle counting is one of the most basic operations.
However, the large volume and high update speed of graph streams make it inefficient to count triangles accurately on them. It is also
unnecessary, as most applications for triangle counting can tolerate small errors. Therefore, approximate triangle counting in graph streams
has been a hot research topic. This study focuses on sample-based approximate triangle counting in graph streams with a sliding window
model. Sliding window models focus on the most recent edges in a graph stream and consider older edges as expired. They are widely
applied in various industrial scenarios and research. This study combines a count-before-sample strategy with the state-of-the-art
approximate triangle counting algorithm and designs a set of novel strategies to deal with the difficulty brought by edge expiration.
Extensive experiments are conducted on real-world datasets to evaluate the proposed algorithm. Results prove that the algorithm decreases
* 基金项目: 国家自然科学基金 (61932001, U20A20174); 香港研资局优配研究金 (14205520)
收稿时间: 2024-01-30; 修改时间: 2024-05-05; 采用时间: 2024-07-08; jos 在线出版时间: 2024-12-31
CNKI 网络首发时间: 2025-01-02

