Page 438 - 《软件学报》2025年第9期
P. 438

软件学报 ISSN 1000-9825, CODEN RUXUEW                                        E-mail: jos@iscas.ac.cn
                 2025,36(9):4349−4372 [doi: 10.13328/j.cnki.jos.007256] [CSTR: 32375.14.jos.007256]  http://www.jos.org.cn
                 ©中国科学院软件研究所版权所有.                                                          Tel: +86-10-62562563



                                                                               *
                 高精度滑动窗口模型下的图流三角形近似计数算法

                 苟向阳  1,2 ,    邹    磊  2 ,    于    旭  1


                 1
                  (香港中文大学 系统工程与工程管理系, 香港 999077)
                 2
                  (北京大学 王选计算机研究所, 北京 100871)
                 通信作者: 邹磊, Email: zoulei@pku.edu.cn

                 摘 要: 近年来, 图流分析在研究领域和工业领域都变得愈发重要. 图流是从数据源持续高速到达的边序列, 这些
                 边组成了一个不断变化的动态图. 在图流上可以进行多种不同的分析, 而三角形计数是其中最基础的操作之一. 由
                 于图流数据规模大, 更新速度高, 在图流上进行精确三角形计数效率较低, 而且并不必要. 因为大部分三角形计数
                 应用都允许一定的误差, 所以, 图流上的近似三角形计数一直都是研究热点之一. 研究基于采样的滑动窗口模型下
                 的图流近似三角形计数. 滑动窗口模型只关注最近到达的图流数据, 较早的图流数据被认定为过期. 它被广泛应用
                 于不同的工业场景和研究工作中. 将一种“采样前计数”的方法与该问题场景下最新的算法结合, 并提出一套策略
                 以应对由于边过期产生的困难. 使用真实数据集展开广泛的实验以测试提出的                          CBS  算法. 实验结果表明, CBS     相
                 比目前最好的工作, 估算误差降低了           70%  以上.
                 关键词: 图流; 三角形计数; 近似算法
                 中图法分类号: TP391


                 中文引用格式: 苟向阳, 邹磊, 于旭. 高精度滑动窗口模型下的图流三角形近似计数算法. 软件学报, 2025, 36(9): 4349–4372. http://
                 www.jos.org.cn/1000-9825/7256.htm
                 英文引用格式: Gou XY, Zou L, Yu X. Approximate Graph Stream Triangle Counting Algorithm Based on Sliding Window Model
                 with High Accuracy. Ruan Jian Xue Bao/Journal of Software, 2025, 36(9): 4349–4372 (in Chinese). http://www.jos.org.cn/1000-9825/
                 7256.htm

                 Approximate Graph Stream Triangle Counting Algorithm Based on Sliding Window Model with
                 High Accuracy

                              1,2
                                       2
                 GOU Xiang-Yang , ZOU Lei , YU Xu 1
                 1
                 (Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong 999077, China)
                 2
                 (Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, China)
                 Abstract:  In  recent  years,  graph  stream  analysis  has  gained  increasing  importance  in  both  research  and  industry.  A  graph  stream  is  a
                 continuous  sequence  of  edges  received  from  a  data  source  at  a  high  speed.  Those  edges  form  a  dynamic  graph  that  is  continuously
                 changing.  Various  analyses  can  be  performed  on  graph  streams.  Among  them,  triangle  counting  is  one  of  the  most  basic  operations.
                 However,  the  large  volume  and  high  update  speed  of  graph  streams  make  it  inefficient  to  count  triangles  accurately  on  them.  It  is  also
                 unnecessary,  as  most  applications  for  triangle  counting  can  tolerate  small  errors.  Therefore,  approximate  triangle  counting  in  graph  streams
                 has  been  a  hot  research  topic.  This  study  focuses  on  sample-based  approximate  triangle  counting  in  graph  streams  with  a  sliding  window
                 model.  Sliding  window  models  focus  on  the  most  recent  edges  in  a  graph  stream  and  consider  older  edges  as  expired.  They  are  widely
                 applied  in  various  industrial  scenarios  and  research.  This  study  combines  a  count-before-sample  strategy  with  the  state-of-the-art
                 approximate  triangle  counting  algorithm  and  designs  a  set  of  novel  strategies  to  deal  with  the  difficulty  brought  by  edge  expiration.
                 Extensive  experiments  are  conducted  on  real-world  datasets  to  evaluate  the  proposed  algorithm.  Results  prove  that  the  algorithm  decreases


                 *    基金项目: 国家自然科学基金  (61932001, U20A20174); 香港研资局优配研究金 (14205520)
                  收稿时间: 2024-01-30; 修改时间: 2024-05-05; 采用时间: 2024-07-08; jos 在线出版时间: 2024-12-31
                  CNKI 网络首发时间: 2025-01-02
   433   434   435   436   437   438   439   440   441   442   443