Page 204 - 《软件学报》2020年第11期
P. 204

软件学报 ISSN 1000-9825, CODEN RUXUEW                                       E-mail: jos@iscas.ac.cn
                 Journal of Software,2020,31(11):3519−3539 [doi: 10.13328/j.cnki.jos.005826]   http://www.jos.org.cn
                 ©中国科学院软件研究所版权所有.                                                         Tel: +86-10-62562563


                                                         ∗
                 基于时序分区的时态索引与查询

                 杨佐希,   汤   娜,   汤   庸,   潘明明,   李丁丁,   叶小平


                 (华南师范大学  计算机学院,广东  广州  510631)
                 通讯作者:  汤庸, E-mail: ytang@m.scnu.edu.cn, http://www.scholat.com/ytang

                 摘   要:  时态索引作为一种高效管理和检索时态数据的有效手段,一直是时态数据领域的研究热点.提出了一种
                 基于时序分区的时态索引技术 TPindex.首先将海量时态数据的时态属性映射到二维平面上,对平面上的“有效时
                 间”点进行采样处理,通过使用自上而下,自左而右的时序分区方法将平面划分成若干个均匀的区域.其次,使用基于
                 拟序关系的线序划分算法对每个分区中的数据构建数据结构,并建立基于“有效时间戳”的全区索引,实现“一次一
                 集合”的数据查询操作.再次,还提出了使用分文件存储线序索引的模式将分区线序索引磁盘化,同时可以结合多线
                 程技术并行处理数据,充分利用现代化硬件资源以满足海量数据下的高性能需求,提高索引性能.另一方面,我们还
                 研究了海量时态数据下 TPindex 的增量式更新操作.最后,设计相应的仿真实验,通过与现有的代表性工作进行对比
                 评估,验证了所提出方法的有效性和实用价值.
                 关键词:  时态索引;时序分区;拟序关系;海量数据;并行化
                 中图法分类号: TP311

                 中文引用格式:  杨佐希,汤娜,汤庸,潘明明,李丁丁,叶小平.基于时序分区的时态索引与查询.软件学报,2020,31(11):
                 3519−3539. http://www.jos.org.cn/1000-9825/5826.htm
                 英文引用格式: Yang ZX, Tang N, Tang Y, Pan MM, Li DD, Ye XP. Temporal index and query based on timing partition. Ruan
                 Jian Xue Bao/Journal of Software, 2020,31(11):3519−3539 (in Chinese). http://www.jos.org.cn/1000-9825/5826.htm

                 Temporal Index and Query Based on Timing Partition

                 YANG Zuo-Xi,  TANG Na,   TANG Yong,   PAN Ming-Ming,   LI Ding-Ding,    YE Xiao-Ping
                 (School of Computer Science, South China Normal University, Guangzhou 510631, China)
                 Abstract:    Temporal index is one of key methods for temporal data managements and retrieval, which has been a hotspot in the field of
                 temporal data. This paper presents a temporal index technique TPindex which is based on a temporal timing partition method. Firstly, the
                 temporal attributes of massive amount of temporal data is mapped to a two-dimensional plane and the “Valid Time” points in this plane
                 are sampled  for  timing partition.  A  “form up to down  and form  left to right” timing partition  method is used to divide the plane into
                 several balanced temporal areas and whole-partition index would be established at the same time. Once the steps above are completed,
                 temporal data  can be dynamically indexed by its querying schema  of “one time,  one set”. Secondly,  the TPindex  would build data
                 structures through using “linear order partition” algorithm based on quasi-order relation for the data in each temporal area. Besides, a
                 “Separated Files Model Index” based on disks and multi-threading parallel process technique that can be combined are proposed to make
                 full use of modern hardware resources to meet the high performance needs under high-volume data, leading to better performance with
                 index. On the other hand, the incremental updating  algorithm  was  also studied. Finally, the  corresponding simulation  experiments are
                 designed to compare with the current representative work to verify the feasibility and validity of the proposed algorithm.

                    ∗ 基金项目:  国家自然科学基金(61772211, U181120009);  广州市产学研协同创新重大专项(201704020203);  广东省应用型科技
                 研发专项(2016B010124008)
                    Foundation  item:  National  Natural Science Foundation of  China (61772211,  U181120009);  Major Project of Industry-university-
                 research  Collaborative Innovation of Guangzhou  Municipality (201704020203); Special Fund for Applied  Technology  Research  and
                 Development of Guangdong Province (2016B010124008)
                     收稿时间: 2018-09-05;  修改时间: 2019-01-07;  采用时间: 2019-02-22
   199   200   201   202   203   204   205   206   207   208   209