Page 459 - 《软件学报》2025年第5期
P. 459

王少鹏 等: 面向时间有序事务数据的聚簇频繁模式挖掘                                                      2359


                 ICFPM  算法相对于其他     4  种算法空间开销总是最少, 并且        Naive-NoPrune 算法要好于   Naive 算法, 这与空间复杂
                 度分析结果相同. 当      maxNR  取  0  时, Naive 算法要比  ICFPM  多消耗约  400 MB  空间.

                 5   总 结

                    本文首次针对时间有序事务数据中聚簇频繁模式的挖掘问题进行了研究. 首先给出了该问题的形式化描述,
                 接着提出了一种聚簇频繁模式挖掘算法              ICFPM. 该算法在   Naive 算法中引入了    2  种有效的优化策略以及高效的数
                 据结构   ICFPM-list, 有效地减少了  Naive 算法的冗余计算. 实验结果表明, ICFPM        算法中使用的优化策略和数据结
                 构都是有效的, 算法在时间和空间开销方面都要好于                 Naive 方法, 可以有效解决时间有序事务数据中挖掘聚簇频
                 繁模式的问题.

                 References:
                  [1]  Agrawal  R,  Imieliński  T,  Swami  A.  Mining  association  rules  between  sets  of  items  in  large  databases.  In:  Proc.  of  the  1993  ACM
                     SIGMOD Int’l Conf. on Management of Data. Washington: ACM, 1993. 207–216. [doi: 10.1145/170035.170072]
                  [2]  Agrawal R, Srikant R. Fast algorithms for mining association rules in large databases. In: Proc. of the 20th Int’l Conf. on Very Large Data
                     Bases. San Francisco: Morgan Kaufmann Publishers Inc., 1994. 487–499.
                  [3]  Han JW, Pei J, Yin YW. Mining frequent patterns without candidate generation. ACM SIGMOD Record, 2000, 29(2): 1–12. [doi: 10.
                     1145/335191.335372]
                  [4]  Zaki MJ. Scalable algorithms for association mining. IEEE Trans. on Knowledge and Data Engineering, 2000, 12(3): 372–390. [doi: 10.
                     1109/69.846291]
                  [5]  Guo YH, Tong YH, Tang SW, Yang DQ. Inverse frequent itemset mining based on FP-Tree. Ruan Jian Xue Bao/Journal of Software,
                     2008, 19(2): 338–350 (in Chinese with English abstract). https://jos.org.cn/jos/article/abstract/20080215 [doi: 10.3724/SP.J.1001.2008.
                     00338]
                  [6]  Ding JM, Li HB, Deng B, Jia LY, You JG. Fast mining algorithm of frequent itemset based on spark. Ruan Jian Xue Bao/Journal of
                     Software, 2023, 34(5): 2446–2464 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6404.htm [doi: 10.13328/j.cnki.
                     jos.006404]
                  [7]  Fournier-Viger P, Yang P, Kiran RU, Ventura S, Luna JM. Mining local periodic patterns in a discrete sequence. Information Sciences,
                     2021, 544: 519–548. [doi: 10.1016/j.ins.2020.09.044]
                  [8]  Fournier-Viger  P,  Tseng  VS.  Mining  top-K  non-redundant  association  rules.  In:  Proc.  of  the  20th  Int’l  Symp.  on  Methodologies  for
                     Intelligent Systems. Macao: Springer, 2012. 31–40. [doi: 10.1007/978-3-642-34624-8_4]
                  [9]  Agouti  T.  Graph-based  modeling  using  association  rule  mining  to  detect  influential  users  in  social  networks.  Expert  Systems  with
                     Applications, 2022, 202: 117436. [doi: 10.1016/j.eswa.2022.117436]
                 [10]  Yao QY, Yang H, Bao BW, Yu A, Zhang J, Cheriet M. Core and spectrum allocation based on association rules mining in spectrally and
                     spatially elastic optical networks. IEEE Trans. on Communications, 2021, 69(8): 5299–5311. [doi: 10.1109/TCOMM.2021.3082768]
                 [11]  Agrawal R, Srikant R. Mining sequential patterns. In: Proc. of the 11th Int’l Conf. on Data Engineering. Taipei: IEEE, 1995. 3–14. [doi:
                     10.1109/ICDE.1995.380415]
                 [12]  Han J, Pei JW, Mortazavi-Asl B, Pinto H, Chen QM, Dayal U, Hsu MC. PrefixSpan: Mining sequential patterns efficiently by prefix-
                     projected pattern growth. In: Proc. of the 17th Int’l Conf. on Data Engineering. Heidelberg: IEEE, 2001. 215–224. [doi: 10.1109/ICDE.
                     2001.914830]
                 [13]  Huynh HM, Nguyen LTT, Pham NN, Oplatková ZK, Yun U, Vo B. An efficient method for mining sequential patterns with indices.
                     Knowledge-based Systems, 2022, 239: 107946. [doi: 10.1016/j.knosys.2021.107946]
                 [14]  Zhang W, Yoshida T, Tang XJ, Wang Q. Text clustering using frequent itemsets. Knowledge-based Systems, 2010, 23(5): 379–388. [doi:
                     10.1016/j.knosys.2010.01.011]
                 [15]  Zhang  LK,  Yang  GF.  Cluster  analysis  of  PM 2.5   pollution  in  China  using  the  frequent  itemset  clustering  approach.  Environmental
                     Research, 2022, 204: 112009. [doi: 10.1016/j.envres.2021.112009]
                 [16]  Dong GZ, Li JY. Efficient mining of emerging patterns: Discovering trends and differences. In: Proc. of the 5th ACM SIGKDD Int’l
                     Conf. on Knowledge Discovery and Data Mining. San Diego: ACM, 1999. 43–52. [doi: 10.1145/312129.312191]
                 [17]  Zhang CS, Liu CC, Zhang XL, Almpanidis G. An up-to-date comparison of state-of-the-art classification algorithms. Expert Systems with
                     Applications, 2017, 82: 128–150. [doi: 10.1016/j.eswa.2017.04.003]
   454   455   456   457   458   459   460   461   462   463   464