Page 204 - 《软件学报》2020年第11期
P. 204
软件学报 ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software,2020,31(11):3519−3539 [doi: 10.13328/j.cnki.jos.005826] http://www.jos.org.cn
©中国科学院软件研究所版权所有. Tel: +86-10-62562563
∗
基于时序分区的时态索引与查询
杨佐希, 汤 娜, 汤 庸, 潘明明, 李丁丁, 叶小平
(华南师范大学 计算机学院,广东 广州 510631)
通讯作者: 汤庸, E-mail: ytang@m.scnu.edu.cn, http://www.scholat.com/ytang
摘 要: 时态索引作为一种高效管理和检索时态数据的有效手段,一直是时态数据领域的研究热点.提出了一种
基于时序分区的时态索引技术 TPindex.首先将海量时态数据的时态属性映射到二维平面上,对平面上的“有效时
间”点进行采样处理,通过使用自上而下,自左而右的时序分区方法将平面划分成若干个均匀的区域.其次,使用基于
拟序关系的线序划分算法对每个分区中的数据构建数据结构,并建立基于“有效时间戳”的全区索引,实现“一次一
集合”的数据查询操作.再次,还提出了使用分文件存储线序索引的模式将分区线序索引磁盘化,同时可以结合多线
程技术并行处理数据,充分利用现代化硬件资源以满足海量数据下的高性能需求,提高索引性能.另一方面,我们还
研究了海量时态数据下 TPindex 的增量式更新操作.最后,设计相应的仿真实验,通过与现有的代表性工作进行对比
评估,验证了所提出方法的有效性和实用价值.
关键词: 时态索引;时序分区;拟序关系;海量数据;并行化
中图法分类号: TP311
中文引用格式: 杨佐希,汤娜,汤庸,潘明明,李丁丁,叶小平.基于时序分区的时态索引与查询.软件学报,2020,31(11):
3519−3539. http://www.jos.org.cn/1000-9825/5826.htm
英文引用格式: Yang ZX, Tang N, Tang Y, Pan MM, Li DD, Ye XP. Temporal index and query based on timing partition. Ruan
Jian Xue Bao/Journal of Software, 2020,31(11):3519−3539 (in Chinese). http://www.jos.org.cn/1000-9825/5826.htm
Temporal Index and Query Based on Timing Partition
YANG Zuo-Xi, TANG Na, TANG Yong, PAN Ming-Ming, LI Ding-Ding, YE Xiao-Ping
(School of Computer Science, South China Normal University, Guangzhou 510631, China)
Abstract: Temporal index is one of key methods for temporal data managements and retrieval, which has been a hotspot in the field of
temporal data. This paper presents a temporal index technique TPindex which is based on a temporal timing partition method. Firstly, the
temporal attributes of massive amount of temporal data is mapped to a two-dimensional plane and the “Valid Time” points in this plane
are sampled for timing partition. A “form up to down and form left to right” timing partition method is used to divide the plane into
several balanced temporal areas and whole-partition index would be established at the same time. Once the steps above are completed,
temporal data can be dynamically indexed by its querying schema of “one time, one set”. Secondly, the TPindex would build data
structures through using “linear order partition” algorithm based on quasi-order relation for the data in each temporal area. Besides, a
“Separated Files Model Index” based on disks and multi-threading parallel process technique that can be combined are proposed to make
full use of modern hardware resources to meet the high performance needs under high-volume data, leading to better performance with
index. On the other hand, the incremental updating algorithm was also studied. Finally, the corresponding simulation experiments are
designed to compare with the current representative work to verify the feasibility and validity of the proposed algorithm.
∗ 基金项目: 国家自然科学基金(61772211, U181120009); 广州市产学研协同创新重大专项(201704020203); 广东省应用型科技
研发专项(2016B010124008)
Foundation item: National Natural Science Foundation of China (61772211, U181120009); Major Project of Industry-university-
research Collaborative Innovation of Guangzhou Municipality (201704020203); Special Fund for Applied Technology Research and
Development of Guangdong Province (2016B010124008)
收稿时间: 2018-09-05; 修改时间: 2019-01-07; 采用时间: 2019-02-22