Page 286 - Journal of Software (《软件学报》), 2025, No. 7

Qin Z, et al.: High-throughput optimization techniques for Apache Flink streaming analytics applications                                          3207


                     1145/2882903.2882906]
                 [26]  Mai L, Zeng K, Potharaju R, Xu L, Suh S, Venkataraman S, Costa P, Kim T, Muthukrishnan S, Kuppa V, Dhulipalla S, Rao S. Chi: A
                     scalable and programmable control plane for distributed stream processing systems. Proc. of the VLDB Endowment, 2018, 11(10):
                     1303–1316. [doi: 10.14778/3231751.3231765]
                 [27]  Varga B, Balassi M, Kiss A. Towards autoscaling of Apache Flink jobs. Acta Universitatis Sapientiae, Informatica, 2021, 13(1): 39–59.
                     [doi: 10.2478/ausi-2021-0003]
                 [28]  Arkian HR, Pierre G, Tordsson J, Elmroth E. Model-based stream processing auto-scaling in geo-distributed environments. In: Proc. of
                     the 2021 Int’l Conf. on Computer Communications and Networks (ICCCN). Athens: IEEE, 2021. 1–10.
                     [doi: 10.1109/ICCCN52240.2021.9522236]
                 [29]  He CL, Huang Y, Wang CY, Wang N. Dynamic data partitioning strategy based on heterogeneous Flink cluster. In: Proc. of the 5th Int’l
                     Conf. on Artificial Intelligence and Big Data (ICAIBD). Chengdu: IEEE, 2022. 355–360. [doi: 10.1109/ICAIBD55127.2022.9820336]
                 [30]  Tangwongsan K, Hirzel M, Schneider S. Optimal and general out-of-order sliding-window aggregation. Proc. of the VLDB Endowment,
                     2019, 12(10): 1167–1180. [doi: 10.14778/3339490.3339499]
                 [31]  Shahvarani A, Jacobsen HA. Parallel index-based stream join on a multicore CPU. In: Proc. of the 2020 ACM SIGMOD Int’l Conf. on
                     Management of Data. Portland: ACM, 2020. 2523–2537. [doi: 10.1145/3318464.3380576]
                 [32]  Karimov J, Rabl T, Markl V. AStream: Ad-hoc shared stream processing. In: Proc. of the 2019 Int’l Conf. on Management of Data.
                     Amsterdam: ACM, 2019. 607–622. [doi: 10.1145/3299869.3319884]
                 [33]  Karimov J, Rabl T, Markl V. AJoin: Ad-hoc stream joins at scale. Proc. of the VLDB Endowment, 2019, 13(4): 435–448.
                     [doi: 10.14778/3372716.3372718]
                 [34]  McSherry F, Lattuada A, Schwarzkopf M, Roscoe T. Shared arrangements: Practical inter-query sharing for streaming dataflows.
                     arXiv:1812.02639, 2020.
                 [35]  Zhang XQ, Ma K. Toward sliding time window of low watermark to detect delayed stream arrival. In: Proc. of the 16th EAI Int’l Conf.
                     on Collaborative Computing: Networking, Applications and Worksharing. Shanghai: Springer, 2021. 444–454.
                     [doi: 10.1007/978-3-030-67540-0_28]
                 [36]  Yue XF, Shi L, Zhao YH, Ji HX, Wang GR. Dynamic resource allocation strategy for Flink iterative jobs. Ruan Jian Xue Bao/Journal of
                     Software, 2022, 33(3): 985–1004 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6447.htm
                     [doi: 10.13328/j.cnki.jos.006447]
                 [37]  Shaikh SA, Mariam K, Kitagawa H, Kim KS. GeoFlink: A distributed and scalable framework for the real-time processing of spatial
                     streams. In: Proc. of the 29th ACM Int’l Conf. on Information & Knowledge Management. Virtual Event: ACM, 2020. 3149–3156. [doi:
                     10.1145/3340531.3412761]
                 [38]  Putatunda S, Laha AK. Travel time prediction in real time for GPS taxi data streams and its applications to travel safety. Human-centric
                     Intelligent Systems, 2023, 3: 381–401. [doi: 10.1007/s44230-023-00028-0]
                 [39]  Apache Kafka. 2023. https://kafka.apache.org/
                 [40]  Redis. 2024. https://redis.io/
                 [41]  Akidau T, Begoli E, Chernyak S, Hueske F, Knight K, Knowles K, Mills D, Sotolongo D. Watermarks in stream processing systems:
                     Semantics and comparative analysis of Apache Flink and Google Cloud Dataflow. Proc. of the VLDB Endowment, 2021, 14(12):
                     3135–3147. [doi: 10.14778/3476311.3476389]
                 [42]  Wilmanns PS, Geuns SJ, Hausmans JPHM, Bekooij MJG. Buffer sizing to reduce interference and increase throughput of real-time
                     stream processing applications. In: Proc. of the 18th IEEE Int’l Symp. on Real-time Distributed Computing. Auckland: IEEE, 2015. 9–18.
                     [doi: 10.1109/ISORC.2015.14]
                 [43]  Gulisano V, Palyvos-Giannas D, Havers B, Papatriantafilou M. The role of event-time order in data streaming analysis. In: Proc. of the
                     14th ACM Int’l Conf. on Distributed and Event-based Systems. Montreal: ACM, 2020. 214–217. [doi: 10.1145/3401025.3404088]
                 [44]  Dahlgaard S, Knudsen MBT, Thorup M. Practical hash functions for similarity estimation and dimensionality reduction. In: Proc. of the
                     31st Int’l Conf. on Neural Information Processing Systems. Long Beach: Curran Associates Inc., 2017. 6618–6628.
                 [45]  Aumayr D, Marr S, Gonzalez Boix E, Mössenböck H. Asynchronous snapshots of actor systems for latency-sensitive applications. In:
                     Proc. of the 16th ACM SIGPLAN Int’l Conf. on Managed Programming Languages and Runtimes. Athens: ACM, 2019. 157–171. [doi:
                     10.1145/3357390.3361019]
                 [46]  Chandy KM, Lamport L. Distributed snapshots: Determining global states of distributed systems. ACM Trans. on Computer Systems
                     (TOCS), 1985, 3(1): 63–75. [doi: 10.1145/214451.214456]
                 [47]  Performance Tuning. 2023. https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/table/tuning/