Page 362 - 《软件学报》2025年第8期
P. 362

郭娜 等: DRAMA: 更新分布感知的学习型索引                                                       3785


                 更新分布的学习型索引方法. 在          5  个数据集上的    6  个性能指标下的实验结果表明, 利用数据更新分布可以加速索
                 引查询性能. 本文使用的延迟学习思想可以周期性地主动感知数据的更新分布, 从而克服了无法提前获取数据更
                 新分布的问题, 有效地适应了随机插入环境. 近似拟合更新分布的方法加速了由于频繁数据更新导致的模型重新
                 训练过程, 进一步提高了索引的查询性能. 当触发模型合并时, 无须使用最小二乘法即可快速合并模型. 本文采用
                 的  LLPC  无损参数压缩策略有效地降低了索引的内存占用.


                 References:
                  [1]  Cai P, Zhang SM, Liu PR, Sun LM, Li CP, Chen H. An overview of learned index technologies for intelligent database. Chinese Journal
                     of Computers, 2023, 46(1): 51–69 (in Chinese with English abstract). [doi: 10.11897/SP.J.1016.2023.00051]
                  [2]  Sun ZY, Zhou XH, Li GL. Learned index: A comprehensive experimental evaluation. Proc. of the VLDB Endowment, 2023, 16(8):
                     1992–2004. [doi: 10.14778/3594512.3594528]
                  [3]  Kipf A, Marcus R, van Renen A, Stoian M, Kemper A, Kraska T, Neumann T. RadixSpline: A single-pass learned index. In: Proc. of the
                     3rd Int’l Workshop on Exploiting Artificial Intelligence Techniques for Data Management. Portland: ACM, 2020. 1–5. [doi: 10.1145/
                     3401071.3401659]
                  [4]  Li PF, Lu H, Zheng Q, Yang L, Pan G. LISA: A learned index structure for spatial data. In: Proc. of the 2020 ACM SIGMOD Int’l Conf.
                     on Management of Data. New York: ACM, 2020. 2119–2133. [doi: 10.1145/3318464.3389703]
                  [5]  Nathan V, Ding JL, Alizadeh M, Kraska T. Learning muti-dimensional indexes. In: Proc. of the 2020 ACM SIGMOD Int’l Conf. on
                     Management of Data. Portland: ACM, 2020. 985–1000. [doi: 10.1145/3318464.3380579]
                  [6]  Ding JL, Nathan V, Alizadeh M, Kraska T. Tsunami: A learned multi-dimensional index for correlated data and skewed workloads. Proc.
                     of the VLDB Endowment, 2020, 14(2): 74–86. [doi: 10.14778/3425879.3425880]
                  [7]  Gao J, Cao X, Yao X, Zhang G, Wang W. LMSFC: A novel multidimensional index based on learned monotonic space filling curves.
                     Proc. of the VLDB Endowment, 2023, 16(10): 2605–2617. [doi: 10.14778/3603581.3603598]
                  [8]  Yu T, Liu G F, Liu A, Li Z X, Zhao L. LIFOSS: A learned index scheme for streaming scenarios. World Wide Web, 2023, 26(1):
                     501–518. [doi: 10.1007/s11280-022-01021-6]
                  [9]  Yang G, Liang L, Hadian A, Heinis T. FLIRT: A fast learned index for rolling time frame. In: Proc. of the 26th Int’l Conf. on Extending
                     Database Technology. 2023. 234–246. [doi: 10.48786/edbt.2023.19]
                 [10]  Wu S, Li Y, Zhu HQ, Zhao JB, Chen G. Dynamic index construction with deep reinforcement learning. Data Science and Engineering,
                     2022, 7(2): 87–101. [doi: 10.1007/s41019-022-00186-4]
                 [11]  Gu T, Feng KY, Cong G, Long C, Wang Z, Wang S. The RLR-Tree: A reinforcement learning based R-tree for spatial data. Proc. of the
                     ACM on Management of Data, 2023, 1(1): 63. [doi: 10.1145/3588917]
                 [12]  Database Architects. The case for B-tree index structures. 2017. http://databasearchitects.blogspot.com/2017/12/the-case-for-b-tree-index-
                     structures.html
                 [13]  Stanford DAWN Cuckoo Hashing. 2017. https://github.com/stanford-futuredata/index-baselines
                 [14]  Ding JL, Minhas UF, Yu J, Wang C, Do J, Li YN, Zhang HT, Chandramouli B, Gehrke J, Kossmann D, Lomet DB, Kraska T. ALEX: An
                     updatable  adaptive  learned  index.  In:  Proc.  of  the  2020  ACM  SIGMOD  Int’l  Conf.  on  Management  of  Data.  Portland:  ACM,  2020.
                     969–984. [doi: 10.1145/3318464.3389711]
                 [15]  Ferragina P, Vinciguerra G. The PGM-index: A fully-dynamic compressed learned index with provable worst-case bounds. Proc. of the
                     VLDB Endowment, 2020, 13(8): 1162–1175. [doi: 10.14778/3389133.3389135]
                 [16]  Wu JC, Zhang Y, Chen SM, Wang J, Chen Y, Xing CX. Updatable learned index with precise positions. Proc. of the VLDB Endowment,
                     2021, 14(8): 1276–1288. [doi: 10.14778/3457390.3457393]
                 [17]  Choi D, Yoon H, Lee H, Chung YD. Waffle: In-memory grid index for moving objects with reinforcement learning-based configuration
                     tuning system. Proc. of the VLDB Endowment, 2022, 15(11): 2375–2388. [doi: 10.14778/3551793.3551800]
                 [18]  Shin J, Wang JG, Aref WG. The LSM RUM-tree: A log structured merge R-tree for update-intensive spatial workloads. In: Proc. of the
                     37th IEEE Int’l Conf. on Data Engineering (ICDE). Chania: IEEE, 2021. 2285–2290. [doi: 10.1109/ICDE51399.2021.00238]
                 [19]  O’Neil PE, Cheng E, Gawlick D, O’Neil E. The log-structured merge-tree (LSM-tree). Acta Informatica, 1996, 33(4): 351–385. [doi: 10.
                     1007/s002360050048]
                 [20]  Crotty A. Hist-tree: Those who ignore it are doomed to learn. In: Proc. of the 11th Annual Conf. on Innovative Data Systems Research
                     (CIDR 2021). 2021.
                 [21]  Leis V, Kemper A, Neumann T. The adaptive radix tree: ARTful indexing for main-memory databases. In: Proc. of  the 29th IEEE Int’l
   357   358   359   360   361   362   363   364   365   366   367