Page 459 - 《软件学报》2025年第5期
P. 459
王少鹏 等: 面向时间有序事务数据的聚簇频繁模式挖掘 2359
ICFPM 算法相对于其他 4 种算法空间开销总是最少, 并且 Naive-NoPrune 算法要好于 Naive 算法, 这与空间复杂
度分析结果相同. 当 maxNR 取 0 时, Naive 算法要比 ICFPM 多消耗约 400 MB 空间.
5 总 结
本文首次针对时间有序事务数据中聚簇频繁模式的挖掘问题进行了研究. 首先给出了该问题的形式化描述,
接着提出了一种聚簇频繁模式挖掘算法 ICFPM. 该算法在 Naive 算法中引入了 2 种有效的优化策略以及高效的数
据结构 ICFPM-list, 有效地减少了 Naive 算法的冗余计算. 实验结果表明, ICFPM 算法中使用的优化策略和数据结
构都是有效的, 算法在时间和空间开销方面都要好于 Naive 方法, 可以有效解决时间有序事务数据中挖掘聚簇频
繁模式的问题.
References:
[1] Agrawal R, Imieliński T, Swami A. Mining association rules between sets of items in large databases. In: Proc. of the 1993 ACM
SIGMOD Int’l Conf. on Management of Data. Washington: ACM, 1993. 207–216. [doi: 10.1145/170035.170072]
[2] Agrawal R, Srikant R. Fast algorithms for mining association rules in large databases. In: Proc. of the 20th Int’l Conf. on Very Large Data
Bases. San Francisco: Morgan Kaufmann Publishers Inc., 1994. 487–499.
[3] Han JW, Pei J, Yin YW. Mining frequent patterns without candidate generation. ACM SIGMOD Record, 2000, 29(2): 1–12. [doi: 10.
1145/335191.335372]
[4] Zaki MJ. Scalable algorithms for association mining. IEEE Trans. on Knowledge and Data Engineering, 2000, 12(3): 372–390. [doi: 10.
1109/69.846291]
[5] Guo YH, Tong YH, Tang SW, Yang DQ. Inverse frequent itemset mining based on FP-Tree. Ruan Jian Xue Bao/Journal of Software,
2008, 19(2): 338–350 (in Chinese with English abstract). https://jos.org.cn/jos/article/abstract/20080215 [doi: 10.3724/SP.J.1001.2008.
00338]
[6] Ding JM, Li HB, Deng B, Jia LY, You JG. Fast mining algorithm of frequent itemset based on spark. Ruan Jian Xue Bao/Journal of
Software, 2023, 34(5): 2446–2464 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6404.htm [doi: 10.13328/j.cnki.
jos.006404]
[7] Fournier-Viger P, Yang P, Kiran RU, Ventura S, Luna JM. Mining local periodic patterns in a discrete sequence. Information Sciences,
2021, 544: 519–548. [doi: 10.1016/j.ins.2020.09.044]
[8] Fournier-Viger P, Tseng VS. Mining top-K non-redundant association rules. In: Proc. of the 20th Int’l Symp. on Methodologies for
Intelligent Systems. Macao: Springer, 2012. 31–40. [doi: 10.1007/978-3-642-34624-8_4]
[9] Agouti T. Graph-based modeling using association rule mining to detect influential users in social networks. Expert Systems with
Applications, 2022, 202: 117436. [doi: 10.1016/j.eswa.2022.117436]
[10] Yao QY, Yang H, Bao BW, Yu A, Zhang J, Cheriet M. Core and spectrum allocation based on association rules mining in spectrally and
spatially elastic optical networks. IEEE Trans. on Communications, 2021, 69(8): 5299–5311. [doi: 10.1109/TCOMM.2021.3082768]
[11] Agrawal R, Srikant R. Mining sequential patterns. In: Proc. of the 11th Int’l Conf. on Data Engineering. Taipei: IEEE, 1995. 3–14. [doi:
10.1109/ICDE.1995.380415]
[12] Han J, Pei JW, Mortazavi-Asl B, Pinto H, Chen QM, Dayal U, Hsu MC. PrefixSpan: Mining sequential patterns efficiently by prefix-
projected pattern growth. In: Proc. of the 17th Int’l Conf. on Data Engineering. Heidelberg: IEEE, 2001. 215–224. [doi: 10.1109/ICDE.
2001.914830]
[13] Huynh HM, Nguyen LTT, Pham NN, Oplatková ZK, Yun U, Vo B. An efficient method for mining sequential patterns with indices.
Knowledge-based Systems, 2022, 239: 107946. [doi: 10.1016/j.knosys.2021.107946]
[14] Zhang W, Yoshida T, Tang XJ, Wang Q. Text clustering using frequent itemsets. Knowledge-based Systems, 2010, 23(5): 379–388. [doi:
10.1016/j.knosys.2010.01.011]
[15] Zhang LK, Yang GF. Cluster analysis of PM 2.5 pollution in China using the frequent itemset clustering approach. Environmental
Research, 2022, 204: 112009. [doi: 10.1016/j.envres.2021.112009]
[16] Dong GZ, Li JY. Efficient mining of emerging patterns: Discovering trends and differences. In: Proc. of the 5th ACM SIGKDD Int’l
Conf. on Knowledge Discovery and Data Mining. San Diego: ACM, 1999. 43–52. [doi: 10.1145/312129.312191]
[17] Zhang CS, Liu CC, Zhang XL, Almpanidis G. An up-to-date comparison of state-of-the-art classification algorithms. Expert Systems with
Applications, 2017, 82: 128–150. [doi: 10.1016/j.eswa.2017.04.003]