Page 55 - Journal of Software (软件学报), 2020, No. 9

2676                                 Journal of Software 软件学报 Vol.31, No.9, September 2020

         [90]     Punniyamurthy K, Boroujerdian B, Gerstlauer A. GATSim: Abstract timing simulation of GPUs. In: Proc. of the Design,
             Automation & Test in Europe Conf. (DATE). IEEE, 2017. 43−48.
         [91]     GPGPU-Sim. http://www.gpgpu-sim.org/
         [92]     Bakhoda A, Yuan GL, Fung WWL, Wong H, Aamodt TM. Analyzing CUDA workloads using a detailed GPU simulator. In: Proc. of
             the IEEE Int’l Symp. on Performance Analysis of Systems and Software (ISPASS). IEEE, 2009. 163−174.
         [93]     Wang X, Zhang W. Cache locking vs. partitioning for real-time computing on integrated CPU-GPU processors. In: Proc. of the
             35th IEEE Int’l Performance Computing and Communications Conf. (IPCCC). IEEE, 2016. 1−8.
         [94]     Picchi J, Zhang W. Impact of L2 cache locking on GPU performance. In: Proc. of the SoutheastCon 2015. IEEE, 2015. 1−4.
         [95]     Huangfu Y, Zhang W. Warp-Based load/store reordering to improve GPU data cache time predictability and performance. In: Proc.
             of the 19th IEEE Int’l Symp. on Real-Time Distributed Computing (ISORC). IEEE, 2016. 166−173.
         [96]     Huangfu Y, Zhang W. Warp-Based load/store reordering to improve GPU time predictability. JCSE, 2017,11(2).
         [97]     Chen G, Guan N, Lü MS, Wang Y. State-of-the-Art survey of real-time multicore system. Ruan Jian Xue Bao/Journal of Software,
             2018,29(7):2152−2176 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5580.htm [doi: 10.13328/j.cnki.jos.005580]
         [98]     Kato S, Lakshmanan K, Rajkumar R, Ishikawa Y. TimeGraph: GPU scheduling for real-time multi-tasking environments. In: Proc.
             of the 2011 USENIX Conf. on USENIX Annual Technical Conf. ACM, 2011. 2.
         [99]     Kato S, Lakshmanan K, Kumar A, Kelkar M, Ishikawa Y, Rajkumar R. RGEM: A responsive GPGPU execution model for runtime
             engines. In: Proc. of the 32nd Real-Time Systems Symp. (RTSS). IEEE, 2011. 57−66.
        [100]     Basaran C, Kang KD. Supporting preemptive task executions and memory copies in GPGPUs. In: Proc. of the 24th Euromicro
             Conf. on Real-Time Systems (ECRTS). IEEE, 2012. 287−296.
        [101]     Zhong J, He B. Kernelet: High-throughput GPU kernel executions with dynamic slicing and scheduling. IEEE Trans. on Parallel
             Distrib. Syst., 2014,25(6):1522−1532.
        [102]     Verner U, Schuster A, Silberstein M, Mendelson A. Scheduling processing of real-time data streams on heterogeneous multi-GPU
             systems. In: Proc. of the 5th Annual Int’l Systems and Storage Conf. (SYSTOR). ACM, 2012.
        [103]     Verner U, Mendelson A, Schuster A. Batch method for efficient resource sharing in real-time multi-GPU systems. In: Proc. of the
             15th Int’l Conf. on Distributed Computing and Networking (ICDCN). Springer-Verlag, 2014. 347−362.
        [104]     Verner U, Mendelson A, Schuster A. Scheduling periodic real-time communication in multi-GPU systems. In: Proc. of the 23rd
             Int’l Conf. on Computer Communication and Networks (ICCCN). IEEE, 2014. 1−8.
        [105]     Kim J, Andersson B, De Niz D, Rajkumar R. Segment-Fixed priority scheduling for self-suspending real-time tasks. In: Proc. of the
             34th Real-Time Systems Symp. (RTSS). IEEE, 2013. 246−257.
        [106]     Chen G, Zhao Y, Shen X, Zhou H. EffiSha: A software framework for enabling efficient preemptive scheduling of GPU. In: Proc.
             of the PPoPP. 2017. 3−16.
        [107]     Wang J, Rubin N, Sidelnik A, Yalamanchili S. Dynamic thread block launch: A lightweight execution mechanism to support
             irregular applications on GPUs. ACM SIGARCH Computer Architecture News, 2015,43(3):528−540.
        [108]     Hosseinimotlagh S, Kim H. Thermal-Aware servers for real-time tasks on multi-core GPU-integrated embedded systems. In: Proc.
             of the 25th IEEE Real-Time and Embedded Technology and Applications Symp. (RTAS). IEEE, 2019. 254−266.
        [109]     Nugteren C, Van den Braak GJ, Corporaal H, Bal HE. A detailed GPU cache model based on reuse distance theory. In: Proc. of the
             20th Int’l Symp. on High Performance Computer Architecture (HPCA). IEEE, 2014. 37−48.
        [110]     Liang Y, Li X. Efficient kernel management on GPUs. ACM Transactions on Computer Systems, 2017,16(4):115:1−115:24.
        [111]     Park JJK, Park Y, Mahlke S. Dynamic resource management for efficient utilization of multitasking GPUs. ACM SIGARCH
             Computer Architecture News, 2017,45(1):527−540.
        [112]     Elliott GA, Ward BC, Anderson JH. GPUSync: A framework for real-time GPU management. In: Proc. of the Real-Time Systems
             Symp. 2013. 33−44.
        [113]     Pellizzoni R, Betti E, Bak S, Yao G, Criswell J, Caccamo M, Kegley R. A predictable execution model for COTS-based embedded
             systems. In: Proc. of the 17th Real-Time and Embedded Technology and Applications Symp. (RTAS). IEEE, 2011. 269−279.
        [114]     Alhammad A, Pellizzoni R. Time-Predictable execution of multithreaded applications on multicore systems. In: Proc. of the Design,
             Automation & Test in Europe Conf. (DATE). European Design and Automation Association, 2014. 1−6.
        [115]     Abdelouahab K, Pelcat M, Serot J, Berry F. Accelerating CNN inference on FPGAs: A survey. 2018.