Journal of Software (软件学报), 2020, Issue 10

Journal of Software (软件学报) ISSN 1000-9825, CODEN RUXUEW                    E-mail: jos@iscas.ac.cn
         Journal of Software,2020,31(10):3100–3119 [doi: 10.13328/j.cnki.jos.006066]   http://www.jos.org.cn
         © Institute of Software, Chinese Academy of Sciences. All rights reserved.    Tel: +86-10-62562563


         Survey of Job Scheduling and Resource Management Technologies for Online-Offline Colocation ∗

         WANG Kang-Jin (王康瑾)¹, JIA Tong (贾统)³, LI Ying (李影)¹,²
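         1 (School of Software and Microelectronics, Peking University, Beijing 102600, China)
         2 (National Engineering Research Center for Software Engineering, Peking University, Beijing 100871, China)
         3 (School of Electronics and Computer Science, Peking University, Beijing 100871, China)
         Corresponding author: LI Ying, E-mail: li.ying@pku.edu.cn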

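         Abstract: Data centers are important information infrastructure and a key support for enterprise Internet
         applications. However, server resource utilization in data centers is currently low (only 10%~20%), which wastes
         a large amount of resources, incurs substantial extra operation and maintenance cost, and has become a key
         problem that restricts enterprises from improving computing efficiency. Colocation deploys online jobs and
         offline jobs together, so that idle resources in online clusters serve the computing needs of offline jobs. As an
         important technique, colocation can effectively improve data center resource utilization and has become a
         research hotspot in academia and industry. This paper analyzes the characteristics of online and offline jobs and
         discusses the technical challenges that colocation faces, such as performance interference between online and
         offline jobs. It surveys online-offline colocation technology in terms of performance interference models, job
         scheduling, resource isolation, and dynamic resource allocation, and uses typical industry colocation management
         systems as examples to discuss how these key technologies are applied in industry and with what effect. Finally,
         it outlines future research directions.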
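         Key words: data center; resource utilization; scheduling algorithm; resource management technology; performance interference
         Chinese Library Classification (CLC) number: TP311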
         Chinese citation format: 王康瑾,贾统,李影.在离线混部作业调度与资源管理技术研究综述.软件学报,2020,31(10):3100–3119.
         http://www.jos.org.cn/1000-9825/6066.htm
         English citation format: Wang KJ, Jia T, Li Y. State-of-the-art survey of scheduling and resource management technology for colocation jobs.
         Ruan Jian Xue Bao/Journal of Software, 2020,31(10):3100–3119 (in Chinese). http://www.jos.org.cn/1000-9825/6066.htm

         State-of-the-art Survey of Scheduling and Resource Management Technology for Colocation
         Jobs
         WANG Kang-Jin¹, JIA Tong³, LI Ying¹,²
         1 (School of Software and Microelectronics, Peking University, Beijing 102600, China)
         2 (National Engineering Research Center for Software Engineering, Peking University, Beijing 100871, China)
         3 (School of Electronics and Computer Science, Peking University, Beijing 100871, China)
         Abstract:    Data centers are not only important IT infrastructure but also a key support for enterprise Internet applications. However,
         data center resource utilization is low (only 10%~20%), which wastes a large amount of resources, incurs substantial extra operation
         and maintenance cost, and has become a key problem that restricts enterprises from improving computing efficiency. By deploying
         online services and offline tasks together, colocation can effectively improve data center resource utilization, and it has become a
         research hotspot in academia and industry. This paper analyzes the characteristics of online services and offline tasks and discusses the
         technical challenges that colocation faces, such as performance interference between services and jobs. It surveys the key technologies
         in terms of performance interference models, job scheduling, resource isolation, and dynamic resource allocation, and discusses the
         application and effect of colocation in industry, using four typical colocation systems as examples. Finally, future research
         directions are presented.
         Key words:    Internet datacenter; resource utilization; job scheduling; resource management technology; performance interference

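            ∗ Foundation item: Key-area Research and Development Program of Guangdong Province, China (2020B010164003)
             This paper was recommended by the guest editors of the special issue "Frontier Advances in System Software": Researcher WU Yanjun, Professor CHEN Haibo, Researcher BAO Yungang, and Researcher LI Ling.
             Received: 2020-02-10;  Revised: 2020-04-04;  Accepted: 2020-05-09;  Published online by jos: 2020-06-10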