Page 398 - Journal of Software (《软件学报》), 2025, No. 9
Wang Shang et al.: Neural network-based compression and query method for distributed tracing data 4309
[20] Huffman DA. A method for the construction of minimum-redundancy codes. Proc. of the IRE, 1952, 40(9): 1098–1101.
[doi: 10.1109/JRPROC.1952.273898]
[21] Grafana Labs. Grafana Tempo. 2024. https://grafana.com/oss/tempo/
[22] Jaeger. Jaeger: Open source, distributed tracing platform. 2024. https://www.jaegertracing.io/
[23] Witten IH, Neal RM, Cleary JG. Arithmetic coding for data compression. Communications of the ACM, 1987, 30(6): 520–540.
[doi: 10.1145/214762.214771]
[24] Merhav N, Gutman M, Ziv J. On the estimation of the order of a Markov chain and universal data compression. IEEE Trans. on
Information Theory, 1989, 35(5): 1014–1019. [doi: 10.1109/18.42210]
[25] Deutsch P. DEFLATE compressed data format specification version 1.3. 1996. [doi: 10.17487/RFC1951]
[26] Moffat A. Implementing the PPM data compression scheme. IEEE Trans. on Communications, 1990, 38(11): 1917–1921.
[doi: 10.1109/26.61469]
[27] Goyal M, Tatwawadi K, Chandak S, Ochoa I. DZip: Improved general-purpose lossless compression based on novel neural network
modeling. In: Proc. of the 2021 Data Compression Conf. (DCC). Snowbird: IEEE, 2021. 153–162. [doi: 10.1109/DCC50243.2021.00023]
[28] Schmidhuber J, Heil S. Sequential neural text compression. IEEE Trans. on Neural Networks, 1996, 7(1): 142–146.
[doi: 10.1109/72.478398]
[29] Mahoney MV. Fast text compression with neural networks. In: Proc. of the 13th Int’l Florida Artificial Intelligence Research Society
Conf. Orlando: AAAI Press, 2000. 230–234.
[30] Liu Q, Xu YL, Li Z. DecMac: A deep context model for high efficiency arithmetic coding. In: Proc. of the 2019 Int’l Conf. on Artificial
Intelligence in Information and Communication (ICAIIC). Okinawa: IEEE, 2019. 438–443. [doi: 10.1109/ICAIIC.2019.8668843]
[31] Bellard F. Lossless data compression with neural networks. 2019. https://bellard.org/nncp/nncp.pdf
[32] Goyal M, Tatwawadi K, Chandak S, Ochoa I. DeepZip: Lossless data compression using recurrent neural networks. arXiv:1811.08162,
2018.
[33] Mao Y, Cui YF, Kuo TW, Xue CJ. TRACE: A fast transformer-based general-purpose lossless compressor. In: Proc. of the 2022 ACM
Web Conf. ACM, 2022. 1829–1838. [doi: 10.1145/3485447.3511987]
[34] Wei JY, Zhang GY, Wang Y, Liu ZW, Zhu ZY, Chen JC, Sun TT, Zhou Q. On the feasibility of parser-based log compression in
large-scale cloud systems. In: Proc. of the 19th USENIX Conf. on File and Storage Technologies. USENIX Association, 2021. 249–262.
[35] Yao KD, Sayagh M, Shang WY, Hassan AE. Improving state-of-the-art compression techniques for log management tools. IEEE Trans.
on Software Engineering, 2022, 48(8): 2748–2760. [doi: 10.1109/TSE.2021.3069958]
[36] Christensen R, Li FF. Adaptive log compression for massive log data. In: Proc. of the 2013 ACM SIGMOD Int’l Conf. on Management
of Data. New York: ACM, 2013. 1283–1284. [doi: 10.1145/2463676.2465341]
[37] Lin H, Zhou JY, Yao B, Guo MY, Li J. Cowic: A column-wise independent compression for log stream analysis. In: Proc. of the 15th
IEEE/ACM Int’l Symp. on Cluster, Cloud and Grid Computing. Shenzhen: IEEE, 2015. 21–30. [doi: 10.1109/CCGrid.2015.45]
[38] Liu JY, Zhu JM, He SL, He PJ, Zheng ZB, Lyu MR. Logzip: Extracting hidden structures via iterative clustering for log compression. In:
Proc. of the 34th IEEE/ACM Int’l Conf. on Automated Software Engineering (ASE). San Diego: IEEE, 2019. 863–873.
[doi: 10.1109/ASE.2019.00085]
[39] Rodrigues K, Luo Y, Yuan D. CLP: Efficient and scalable search on compressed text logs. In: Proc. of the 15th USENIX Symp. on
Operating Systems Design and Implementation. USENIX Association, 2021. 183–198.
[40] Ding HL, Yan S, Zhai J, Ma SQ. ELISE: A storage efficient logging system powered by redundancy reduction and representation
learning. In: Proc. of the 30th USENIX Security Symp. USENIX Association, 2021. 3023–3040.
[41] Li XY, Zhang HY, Le VH, Chen PF. LogShrink: Effective log compression by leveraging commonality and variability of log data. In:
Proc. of the 46th IEEE/ACM Int’l Conf. on Software Engineering. Lisbon: ACM, 2024. 23. [doi: 10.1145/3597503.3608129]
[42] Agarwal R, Khandelwal A, Stoica I. Succinct: Enabling queries on compressed data. In: Proc. of the 12th USENIX Symp. on Networked
Systems Design and Implementation. Oakland: USENIX Association, 2015. 337–350.
[43] Pibiri GE, Petri M, Moffat A. Fast dictionary-based compression for inverted indexes. In: Proc. of the 12th ACM Int’l Conf. on Web
Search and Data Mining. Melbourne: ACM, 2019. 6–14. [doi: 10.1145/3289600.3290962]
[44] Zhang F, Zhai JD, Shen XP, Wang DL, Chen Z, Mutlu O, Chen WG, Du XY. TADOC: Text analytics directly on compression. The
VLDB Journal, 2021, 30(2): 163–188. [doi: 10.1007/s00778-020-00636-3]
[45] Wei JY, Zhang GY, Chen JC, Wang Y, Zheng WM, Sun TT, Wu JS, Jiang JW. LogGrep: Fast and cheap cloud log storage by exploiting
both static and runtime patterns. In: Proc. of the 18th European Conf. on Computer Systems. Rome: ACM, 2023. 452–468.
[doi: 10.1145/3552326.3567484]

