Page 77 - 《软件学报》2020年第10期

P. 77

张鸿骏等:一种适应 GPU 的混合访问缓存索引框架 3053

6 结束语

数据索引的性能依赖于平均访问内存数.传统的 CHT 数据索引在缓存中应用时,当缓存空间大小与 CHT
空间比值较高时,CHT 频繁的踢出替换方法增加了内存访问数,进而对缓存性能造成了影响.本文分析了基于键
值的内存缓存系统命中率与索引开销因素,依据索引指针开销给出了双重 LRU CCHT 和粗粒度 LRU CCHT,提
供了用于 CPUGPU 环境下减少总线传输与 GPU 内存占用的多级索引数据结构,并在 CPUGPU 异构环境下加
以实现.通过实验验证,CCHT 在保证缓存命中率的同时,能够有效地减少内存访问次数,在 GPU 上有良好的可
扩展性与吞吐性能.
未来的工作我们希望进一步优化索引指针的开销,提升缓存的命中率.在 GPU 的实现方法中,考虑通过优
化槽单元锁和 LRU 队列锁以提升并发插入的性能,在散列任务提交任务队列前识别任务类型,预先分配任务执
行所在的 warp,达到同一 warp 中并发执行多个散列表操作,并达到线程粒度的并发性能,提升 GPU 硬件资源利
用率.

References:
[1] Boncz PA, Manegold S, Kersten ML. Database architecture optimized for the new bottleneck: Memory access. VLDB, 1999,99:
54–65.
[2] DeWitt DJ, Katz RH, Olken F, et al. Implementation techniques for main memory database systems. In: Proc. of the 1984 ACM
SIGMOD Int’l Conf. on Management of Data. 1984. 1–8.
[3] Kemper A, Neumann T. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In:
Proc. of the 27th IEEE Int’l Conf. on Data Engineering. 2011. 195–206.
[4] Lim H, Han D, Andersen DG, Kaminsky M. MICA: A holistic approach to fast in-memory key-value storage. In: Proc. of the 11th
USENIX Symp. on Networked Systems Design and Implementation. 2014. 429−444.
[5] Fitzpatrick B. Distributed caching with memcached. Linux Journal, 2004,2004(124):5.
[6] Ma YZ, Meng XF. Research on indexing for cloud data management. Ruan Jian Xue Bao/Journal of Software, 2015,26(1):145−166
(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4688.htm [doi: 10.13328/j.cnki.jos.004688]
[7] Ousterhout J, Gopalan A, Gupta A, et al. The RAMCloud storage system. ACM Trans. on Computer Systems (TOCS), 2015,33(3):
1–55.
[8] Fan B, Andersen DG, Kaminsky M. Memc3: Compact and concurrent memcache with dumber caching and smarter hashing. In:
Proc. of the Presented as Part of the 10th USENIX Symposium on Networked Systems Design and Implementation. 2013. 371–384.
[9] Pagh R, Rodler FF. Cuckoo hashing. In: Proc. of the European Symp. on Algorithms. Berlin, Heidelberg: Springer-Verlag, 2001.
121–133.
[10] Breslow AD, Zhang DP, Greathouse JL, et al. Horton tables: Fast hash tables for in-memory data-intensive computing. In: Proc. of
the 2016 USENIX Annual Technical Conf. 2016. 281–294.
[11] Khorasani F, Belviranli ME, Gupta R, et al. Stadium hashing: Scalable and flexible hashing on GPUs. In: Proc. of the 2015 Int’l
Conf. on Parallel Architecture and Compilation (PACT). IEEE, 2015. 63–74.
[12] Barber R, Lohman G, Pandis I, et al. Memory-efficient hash joins. Proc. of the VLDB Endowment, 2014,8(4):353–364.
[13] Lim H, Fan B, Andersen DG, et al. SILT: A memory-efficient, high-performance key-value store. In: Proc. of the 23rd ACM Symp.
on Operating Systems Principles. 2011. 1–13.
[14] Herlihy M, Shavit N, Tzafrir M. Hopscotch hashing. In: Proc. of the Int’l Symp. on Distributed Computing. Berlin, Heidelberg:
Springer-Verlag, 2008. 350–364.
[15] Pagh R, Wei Z, Yi K, et al. Cache-oblivious hashing. Algorithmica, 2014,69(4):864–883.
[16] Debnath B, Sengupta S, Li J. FlashStore: High throughput persistent key-value store. Proc. of the VLDB Endowment, 2010,
3(1-2):1414–1425.
[17] Atikoglu B, Xu Y, Frachtenberg E, et al. Workload analysis of a large-scale key-value store. In: Proc. of the 12th ACM
SIGMETRICS/PERFORMANCE Joint Int’l Conf. on Measurement and Modeling of Computer Systems. 2012. 53–64.

72 73 74 75 76 77 78 79 80 81 82