Page 182 - Journal of Software (《软件学报》), 2021, No. 7

2100                                     Journal of Software  软件学报 Vol.32, No.7,  July 2021

                [20]    Hargrove PH, Duell JC. Berkeley laboratory checkpoint/restart (BLCR) for Linux clusters. Journal of Physics: Conference Series, 2006,
                     46(1):494–499. [doi: 10.1088/1742-6596/46/1/067]
                [21]    Plank JS, Li K. ICKP: A consistent checkpointer for multicomputers. IEEE Parallel & Distributed Technology: Systems &
                     Applications, 1994,2(2):62–67. [doi: 10.1109/88.311574]
                [22]    Plank JS, Li K, Puening MA. Diskless checkpointing. IEEE Trans. on Parallel and Distributed Systems, 1998,9(10):972–986.
                     [doi: 10.1109/71.730527]
                [23]    Sankaran S, Squyres JM, Barrett B, Sahay V, Lumsdaine A, Duell J, Hargrove P, Roman E. The LAM/MPI checkpoint/restart
                     framework: System-initiated checkpointing. The Int’l Journal of High Performance Computing Applications, 2005,19(4):479–493.
                     [doi: 10.1177/1094342005056139]
                [24]    Zheng G, Ni X, Kalé LV. A scalable double in-memory checkpoint and restart scheme towards exascale. In: Proc. of the IEEE/IFIP
                     Int’l Conf. on Dependable Systems and Networks Workshops (DSN 2012). Boston, 2012. 1–6.
                     [doi: 10.1109/DSNW.2012.6264677]
                [25]    Heidari S, Simmhan Y, Calheiros RN, Buyya R. Scalable graph processing frameworks: A taxonomy and open challenges. ACM
                     Computing Surveys (CSUR), 2018,51(3):1–53.
                [26]    McCune RR, Weninger T, Madey G. Thinking like a vertex: A survey of vertex-centric frameworks for large-scale distributed
                     graph processing. ACM Computing Surveys (CSUR), 2015,48(2):1–39.
                [27]    Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 2008,51(1):107–113.
                [28]    Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin MJ, Shenker S, Stoica I. Resilient distributed datasets: A
                     fault-tolerant abstraction for in-memory cluster computing. In: Proc. of the 9th USENIX Symp. on Networked Systems Design and
                     Implementation (NSDI 12). 2012. 15–28.
                [29]    Stutz P, Bernstein A, Cohen W. Signal/collect: Graph algorithms for the (semantic) Web. In: Patel-Schneider PF, et al. eds. Proc. of
                     the Int’l Semantic Web Conf. (ISWC). Berlin: Springer-Verlag, 2010. 764–780.
                [30]    Bronevetsky G, Marques D, Pingali K, Stodghill P. Automated application-level checkpointing of MPI programs. In: Proc. of the
                     9th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming. New York: Association for Computing Machinery,
                     2003. 84–94. [doi: 10.1145/781498.781513]
                [31]    Beguelin A, Seligman E, Stephan P. Application level fault tolerance in heterogeneous networks of workstations. Journal of
                     Parallel and Distributed Computing, 1997,43(2):147–155.
                [32]    Dathathri R, Gill G, Hoang L, Pingali K. Phoenix: A substrate for resilient distributed graph analytics. In: Proc. of the 24th Int’l
                     Conf. on Architectural Support for Programming Languages and Operating Systems. New York: Association for Computing
                     Machinery, 2019. 615–630. [doi: 10.1145/3297858.3304056]
                [33]    Hoang L, Pontecorvi M, Dathathri R, Gill G, You B, Pingali K, Ramachandran V. A round-efficient distributed betweenness
                     centrality algorithm. In: Proc. of the 24th Symp. on Principles and Practice of Parallel Programming (PPoPP 2019). New York:
                     Association for Computing Machinery, 2019. 272–286.
                [34]    Iyer AP, Liu Z, Jin X, Venkataraman S, Braverman V, Stoica I. ASAP: Fast, approximate graph pattern mining at scale. In: Proc. of
                     the 13th USENIX Conf. on Operating Systems Design and Implementation. Carlsbad: USENIX Association, 2018. 745–761.
                [35]    Zhang Y, Gao Q, Gao L, Wang C. Maiter: An asynchronous graph processing framework for delta-based accumulative iterative
                     computation. IEEE Trans. on Parallel and Distributed Systems, 2014,25(8):2091–2100. [doi: 10.1109/TPDS.2013.235]
                [36]    Wang Z, Gao L, Gu Y, Bao Y, Yu G. A fault-tolerant framework for asynchronous iterative computations in cloud environments. In:
                     Proc. of the 7th ACM Symp. on Cloud Computing. New York: Association for Computing Machinery, 2016. 71–83.
                [37]    Wang Z, Gao L, Gu Y, Bao Y, Yu G. A fault-tolerant framework for asynchronous iterative computations in cloud environments.
                     IEEE Trans. on Parallel and Distributed Systems, 2018,29(8):1678–1692.
                [38]    Avizienis A, Laprie JC, Randell B, Landwehr C. Basic concepts and taxonomy of dependable and secure computing. IEEE Trans.
                     on Dependable and Secure Computing, 2004,1(1):11–33.
                [39]    Poola D, Salehi MA, Ramamohanarao K, Buyya R. Chapter 15—A Taxonomy and Survey of Fault-tolerant Workflow Management
                     Systems in Cloud and Distributed Computing Environments. Elsevier Inc., 2017. 285–320.