Page 80 - 《软件学报》2021年第8期
P. 80
2362 Journal of Software 软件学报 Vol.32, No.8, August 2021
[8] Sun Q, Zhang CY, Wu CM, Zhang JJ, Li LS. Bandwidth reduced parallel SpMV on the SW26010 many-core platform. In: Proc. of
the 47th Int’l Conf. on Parallel Processing. 2018. Article No.54.
[9] Ayguad E, Copty N, Duran A, Hoeflinger J, Lin Y, Massaioli F, Su E, Unnikrishnan P, Zhang G. A proposal for task parallelism in
OpenMP. In: Proc. of the Int’l Workshop on OpenMP. Berlin: Springer-Verlag, 2010. 1−12.
[10] Mller MS, Baron J, Brantley WC, Feng H, Hackenberg D, Henschel R, Jost G, Molka D, Parrott C, Robichaux J. SPEC
OMP2012⎯ An application benchmark suite for parallel systems using OpenMP. In: Proc. of the Int’l Conf. on OpenMP in a
Heterogeneous World. Berlin: Springer-Verlag, 2012. 223−236.
[11] Duran A, Corbaln J, Ayguad E. Evaluation of OpenMP task scheduling strategies. In: Proc. of the OpenMP in a New Era of
Parallelism Int’l Workshop. West Lafayette: Springer-Verlag, 2008. 100−110.
[12] OpenMP 4.0 Specification. 2012. http://www.openmp.org/uncategorized/openmp-40/
[13] Blumofe RD, Leiserson CE. Scheduling multithreaded computations by work stealing. Parallel & Distributed Computing, 1997,
45(2):148−158.
[14] Leiserson CE. Programming irregular parallel applications in cilk. In: Proc. of the Int’l Symp. on Solving Irregularly Structured
Problems in Parallel. Berlin: Springer-Verlag, 1997. 61−71.
[15] Frigo M, Leiserson CE, Randall KH. The implementation of the Cilk-5 multithreaded language. ACM Sigplan Notices, 1999,33(5):
212−223.
[16] Cav V, Zhao J, Shirako J, Sarkar V. Habanero-Java: The new adventures of Old X10. In: Proc. of the 9th Int’l Conf. on Principles
and Practice of Programming in Java. New York: ACM, 2011. 51−61.
[17] Berenbrink P, Friedetzky T, Goldberg LA. The natural work-stealing algorithm is stable. SIAM Journal of Computing, 2003,32(5):
1260−1279.
[18] Tardieu O, Wang H, Lin H. A work-stealing scheduler for X10’s task parallelism with suspension. ACM Sigplan Notices, 2012,
47(8):267−276.
[19] Task Parallel Library (TPL). 2017. https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/task-parallel-library-tpl
[20] Addison C, LaGrone J, Huang L, Chapman B. Openmp 3.0 tasking implementation in OpenUH. In: Proc. of the Open64 Workshop.
2009. 1−10.
[21] Robison A, Voss M, Kukanov A. Optimization via reflection on work stealing in TBB. In: Proc. of the IEEE Int’l Symp. on Parallel
and Distributed Processing. Miami: IEEE, 2008. 1−8.
[22] Chatterjee S, Grossman M, Sbrlea A, Sarkar V. Dynamic task parallelism with a GPU work-stealing runtime system. In: Proc. of
the 24th Int’l Workshop on Languages and Compilers for Parallel Computing. Berlin: Springer-Verlag, 2013. 8−10.
[23] Cédric A, Samuel T, Raymond N, Pierre-André W. StarPU: A unified platform for task scheduling on heterogeneous multicore
architectures. In Proc. of the Euro-Par 15th Int’l Conf. on Parallel Processing. LNCS 5704, Delft, 2009. 863−874.
[24] Tzeng S, Patney A, Owens JD. Task management for irregular parallel workloads on the GPU. In: Proc. of the ACM SIGGRAPH/
EUROGRAPHICS Conf. on High Performance Graphics. Saarbrücken: ACM, 2010. 29−37.
[25] Thread Building Blocks. 2021. https://www.threadingbuildingblocks.org/
[26] Copty N, Duran A, Hoeflinger J, Massaioli F, Massaioli F, Teruel X, Unnikrishnan P, Zhang G. The design of OpenMP tasks. IEEE
Trans. on Parallel & Distributed Systems, 2009,20(3):404−418.
[27] Fu HH, Liao JF, Yang JZ, et al. The Sunway TaihuLight supercomputer: System and applications. SCIENCE CHINA: Information
Sciences, 2016,59(7):1−16.
[28] Acar UA, Blelloch GE, Blumofe RD. The data locality of work stealing. Theory of Computing Systems, 2002,35(3):321−347.
[29] Hamidzadeh B, Lilja DJ. Dynamic scheduling strategies for shared-memory multiprocessors. In: Proc. of the Int’l Conf. on
Distributed Computing Systems. Hong Kong: IEEE, 1996. 208−215.
附中文参考文献:
[1] 王蕾,崔慧敏,陈莉,冯晓兵.任务并行编程模型研究与进展.软件学报,2013,24(1):77−90. http://www.jos.org.cn/1000-9825/4339.
htm [doi: 10.3724/SP.J.1001.2013.04339]