Journal of Software (《软件学报》), 2020, No. 10
Zhang Wei et al.: A discriminative feature dictionary construction algorithm for time series
Fig. 12 Top-10 discriminative features generated by VLWEA on the CBF dataset
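4 Conclusion
Re-representing time series data before mining it is an important research topic. The goal is twofold: to speed up
algorithms by reducing the amount of data they actually process, and to capture the essential content of the original
time series so as to improve classification accuracy. To address the problems of current SFA-based discretized
representations of time series, this paper proposes a variable-length word extraction algorithm that can effectively
learn the optimal word length for each sliding window size. In addition, to address the very large size of the feature
dictionary, this paper defines a new discriminative feature selection statistic and designs a dynamic threshold setting
mechanism to select among the generated features. This approach effectively shrinks the feature dictionary while
still achieving high classification accuracy.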
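The idea of pruning a feature dictionary with a per-word statistic and a data-driven threshold can be illustrated with a minimal sketch. This section does not restate the paper's statistic or its threshold rule, so the code below substitutes two stand-ins that are assumptions of this example: a chi-squared score of word counts against class labels, and the mean score as the cut-off. The function names `chi2_scores` and `select_words` and the toy words are hypothetical.

```python
from collections import Counter


def chi2_scores(bags, labels):
    """Chi-squared score of every word against the class labels.

    bags   : list of Counter, mapping word -> count for one time series
    labels : list of class labels, same length as bags
    NOTE: chi-squared is a stand-in, not the statistic defined in the paper.
    """
    classes = sorted(set(labels))
    n_samples = len(labels)
    samples_per_class = Counter(labels)

    # Observed count of each word inside each class.
    observed = {}
    for bag, y in zip(bags, labels):
        for word, count in bag.items():
            observed.setdefault(word, {c: 0 for c in classes})[y] += count

    scores = {}
    for word, per_class in observed.items():
        word_total = sum(per_class.values())
        score = 0.0
        for c in classes:
            # Count expected if the word were independent of the class.
            expected = word_total * samples_per_class[c] / n_samples
            score += (per_class[c] - expected) ** 2 / expected
        scores[word] = score
    return scores


def select_words(scores):
    """Keep words scoring above the mean score.

    NOTE: the mean is an assumed, data-driven cut-off used for illustration;
    it is not the paper's dynamic threshold mechanism.
    """
    threshold = sum(scores.values()) / len(scores)
    return {word for word, s in scores.items() if s > threshold}


# Toy dictionary: "aab" occurs only in class 0, "bba" only in class 1,
# and "abc" occurs equally often in both classes.
bags = [
    Counter({"aab": 3, "abc": 1}),
    Counter({"aab": 2, "abc": 1}),
    Counter({"bba": 3, "abc": 1}),
    Counter({"bba": 2, "abc": 1}),
]
labels = [0, 0, 1, 1]

scores = chi2_scores(bags, labels)
print(sorted(select_words(scores)))  # ['aab', 'bba']
```

In this toy case the class-neutral word `abc` scores 0 and is dropped, so the dictionary shrinks from three words to two while the class-specific words are retained.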