Page 259 - Journal of Software (Ruan Jian Xue Bao), 2020, No. 10

Zhang W, et al.: An algorithm for constructing a discriminative feature dictionary for time series          3235

                        Fig. 12    Top-10 discriminative features generated by VLWEA on the CBF dataset
         4    Conclusion

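             Re-representing time series data before mining it is an important research topic. The aim is twofold:
         to speed up algorithms by reducing the amount of data they actually process, and to capture the essential
         content of the original time series so as to improve classification accuracy. To address the problems of
         existing SFA-based discretized representations of time series, this paper proposes a variable-length word
         extraction algorithm that effectively learns the optimal word length for each sliding window size. In
         addition, to address the very large size of the feature dictionary, this paper defines a new statistic for
         discriminative feature selection and designs a dynamic threshold-setting mechanism for selecting among the
         generated features. The method effectively reduces the size of the feature dictionary while still achieving
         relatively high classification accuracy.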
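
         The pipeline summarized above (sliding windows, discretized words, score-based feature selection) can
         be illustrated with a minimal Python sketch. It is not the VLWEA algorithm or the paper's statistic. The
         functions `sfa_like_words` and `select_features` are hypothetical names, the bins are fixed rather than
         learned as in SFA, the word length is passed in rather than learned per window, and the score and the
         mean-based threshold are simple stand-ins chosen only to show the overall shape.

```python
import numpy as np
from collections import Counter


def sfa_like_words(series, window, word_len, alphabet="abcd"):
    """Turn one series into a list of words, one per sliding window.

    Simplified stand-in for SFA (assumption): take the first `word_len`
    real-valued DFT components of each z-normalised window and bin them
    into fixed equal-width intervals. Real SFA learns the bin edges.
    """
    series = np.asarray(series, dtype=float)
    # Interior bin edges over an assumed fixed range [-1, 1].
    edges = np.linspace(-1.0, 1.0, len(alphabet) + 1)[1:-1]
    words = []
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        std = w.std()
        w = (w - w.mean()) / std if std > 0 else w - w.mean()
        coeffs = np.fft.rfft(w)[1:] / window  # skip the DC term
        # Interleave real and imaginary parts, keep the first word_len values.
        values = np.column_stack([coeffs.real, coeffs.imag]).ravel()[:word_len]
        words.append("".join(alphabet[i] for i in np.digitize(values, edges)))
    return words


def select_features(bags, labels):
    """Keep words whose score exceeds the mean score over all words.

    Score (illustrative only, not the paper's statistic): the largest gap
    between the fraction of series containing the word in one class and
    in the remaining classes. Assumes at least two classes.
    """
    labels = np.asarray(labels)
    vocab = sorted(set().union(*bags))
    scores = {}
    for word in vocab:
        present = np.array([word in bag for bag in bags])
        scores[word] = max(
            abs(present[labels == c].mean() - present[labels != c].mean())
            for c in np.unique(labels)
        )
    threshold = np.mean(list(scores.values()))  # data-dependent threshold
    return {w: s for w, s in scores.items() if s > threshold}


# Toy usage: two slow and two fast sinusoids as two classes.
t = np.arange(32)
series = [np.sin(0.2 * t), np.sin(0.2 * t + 0.5),
          np.sin(1.5 * t), np.sin(1.5 * t + 0.5)]
labels = [0, 0, 1, 1]
bags = [Counter(sfa_like_words(s, window=8, word_len=4)) for s in series]
selected = select_features(bags, labels)
print(len(set().union(*bags)), "words in dictionary,", len(selected), "kept")
```

         Because the threshold is the mean score, at least one word is always discarded, so the dictionary
         shrinks without a hand-tuned cutoff. A learned word length per window size, as the paper proposes,
         would replace the fixed `word_len` argument.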
