Page 372 - Journal of Software (《软件学报》), 2025, Issue 10

YANG Le et al.: BIVM: A Compilation Framework for Brain-Inspired Computing and Its Prototype    4769


4 (Zhongguancun Laboratory, Beijing 100094, China)
Abstract: Brain-inspired computing chips of various architectures are emerging, and the inference/training/learning algorithms of spiking neural networks (SNNs) and the efficient simulation of biological neural networks have become research hotspots. Meanwhile, efficiently executing applications with different computation/memory-access characteristics on various chips remains a significant challenge, and meeting it is crucial for establishing a robust brain-inspired computing ecosystem. The success of the general-purpose computing ecosystem indicates that a flexible, scalable, and reusable compiler infrastructure is an effective solution to this problem. This study proposes BIVM, a compilation framework for brain-inspired computing, along with its proof-of-concept implementation. Based on MLIR (multi-level intermediate representation), a compiler framework for domain-specific architectures (DSAs), multi-layer IRs customized for SNNs are designed, including an SNN dialect, middle-layer IRs composed mainly of MLIR's built-in dialects, and underlying IRs for the various target chips. To address challenges such as the large architectural differences among brain-inspired chips and the varying granularity of their hardware primitives, BIVM leverages MLIR's progressivity feature. This allows different abstraction levels and concepts to be mixed (e.g., combining fine-grained instructions with coarse-grained computation based on the crossbar structure specific to certain back-ends), which enables software module reuse, reduces compiler development costs, and ultimately leads to high productivity. In addition, the framework provides the flexibility to combine compilation optimizations at various levels, including widely used SNN-specific optimizations (e.g., exploiting computing sparsity and improving parallelism) and low-level optimizations tailored to different back-ends, ensuring performance portability. The current BIVM prototype supports back-ends such as general-purpose processors (control-flow architecture), SNN accelerator chips (FPGAs) with a hybrid control-/data-flow architecture, and data-flow chip designs based on ReRAM (resistive random-access memory, a widely used neuromorphic device). It can optimize and compile deep SNN and biological neural network simulation applications into executables tailored for these chips. Comprehensive testing and performance comparisons demonstrate the potential of this compilation framework in achieving high productivity, portability, and performance.
Key words: brain-inspired computing; compilation framework; brain-inspired computing chip

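Brain-inspired computing is regarded as a highly promising technical route to next-generation artificial intelligence [1] and as one of the major directions for computer architecture in the post-Moore era [2]. Research on it has also advanced more general architectures that integrate storage and computation [3]. Many research teams focus on developing new brain-inspired computing architectures and chips, but efficiently running the various brain-inspired computing applications on those architectures and chips remains difficult. These applications include deep spiking neural networks (D-SNNs), which are oriented toward AI tasks, and simulation workloads that focus on modeling biological spiking neural networks. On the other side, teams developing brain-inspired computing models and algorithms are also trying to compute efficiently on the various brain-inspired chips: the availability and ease of use of efficient computing power is critical to algorithm development, as the history of deep learning has already shown [4].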
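Compilation software for brain-inspired computing needs to serve as the key "bridge" here, that is, to turn the various brain-inspired computing applications into executable forms that can efficiently drive brain-inspired chips with widely differing architectures. This also helps unite research efforts across fields [5] and build a healthy R&D ecosystem, for example by applying new findings from neuroscience to AI, or by putting the latest (brain-inspired) computing chips to broad and efficient use so that new algorithms can evolve and be applied quickly.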
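To the best of our knowledge, there is currently no general compilation framework that targets the full range of brain-inspired computing chips. A brain-inspired computing compilation framework is a compiler infrastructure that is not tied to specific hardware; through a flexible multi-level internal architecture, intermediate representations, and interfaces, it can be extended to support a wide variety of brain-inspired computing applications and chips.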
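As a purely illustrative sketch of what such a hardware-independent infrastructure with pluggable back-ends could look like, the toy Python model below lowers one high-level SNN operation through a shared middle level and then through per-target passes. Every name in it (`Op`, `dense_lif`, `crossbar_mvm`, `register_backend`, and so on) is hypothetical and chosen for illustration only; it is not BIVM's API and does not use MLIR itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Op:
    dialect: str  # abstraction level the op belongs to, e.g. "snn", "mid", "cpu"
    name: str     # hypothetical operation name


# A pass rewrites one list of ops into another list of ops.
Pass = Callable[[List[Op]], List[Op]]


def lower_snn_to_mid(ops: List[Op]) -> List[Op]:
    """Shared, hardware-independent pass: expand SNN-level layers into mid-level ops."""
    out: List[Op] = []
    for op in ops:
        if op.dialect == "snn" and op.name == "dense_lif":
            out += [Op("mid", "matvec"), Op("mid", "lif_update")]
        else:
            out.append(op)  # anything not recognized is passed through unchanged
    return out


def make_target_lowering(table: Dict[str, Op]) -> Pass:
    """Build a target pass from a mapping of mid-level op names to target ops."""
    def run(ops: List[Op]) -> List[Op]:
        # Mid-level ops missing from the table stay as they are, so ops from
        # different abstraction levels can coexist in one program.
        return [table.get(op.name, op) if op.dialect == "mid" else op for op in ops]
    return run


BACKENDS: Dict[str, List[Pass]] = {}


def register_backend(name: str, passes: List[Pass]) -> None:
    """Adding a new chip means registering passes; the shared passes are reused."""
    BACKENDS[name] = passes


register_backend("cpu", [make_target_lowering({
    "matvec": Op("cpu", "loop_nest"),
    "lif_update": Op("cpu", "vector_kernel"),
})])
register_backend("reram", [make_target_lowering({
    "matvec": Op("reram", "crossbar_mvm"),       # coarse-grained primitive
    "lif_update": Op("reram", "scalar_instrs"),  # fine-grained instructions
})])


def compile_for(target: str, program: List[Op]) -> List[Op]:
    ops = lower_snn_to_mid(program)  # same front half for every target
    for target_pass in BACKENDS[target]:
        ops = target_pass(ops)
    return ops


program = [Op("snn", "dense_lif")]
for target in ("cpu", "reram"):
    print(target, [f"{o.dialect}.{o.name}" for o in compile_for(target, program)])
```

The point of the design is that the same source program and the same `lower_snn_to_mid` pass serve both targets; only the small per-target table differs, which is the kind of reuse a hardware-independent framework is meant to allow.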
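This differs from the many existing compilers built for specific brain-inspired chips. Those compilers achieve good performance on their target architecture, but the interfaces between their software modules and layers are designed for the target chip. This limits the reusability of software modules and the interoperability and composability of toolchains, and it makes developing a compiler for a new type of chip costly.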
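The compilation framework mainly serves developers of brain-inspired chips and developers of brain-inspired application development tools. Many development tools aimed at application and model developers have already appeared, such as BindsNET [6], Norse [7], SpykeTorch [8], and SpikingJelly [9]. Ideally, once the software/hardware interface or function library of a brain-inspired chip is adapted to the lower-level representation of the compilation framework, that chip can support all kinds of brain-inspired applications; likewise, when tool developers adapt their front-ends to the framework, their tools gain support for a wider range of hardware. A bridge is thus built between the various brain-inspired applications and brain-inspired chips: a high-level, hardware-independent neural network application needs to be written only once and can be compiled into executables for different chips (portability) with high execution efficiency (performance), while the framework itself can be flexibly extended or reused to support many types of brain-inspired chips and compilation optimization techniques (productivity). The effectiveness of this methodology has already been demonstrated in the history of computing: in general-purpose computing, the application ecosystem built on the open-source processor compilation frameworks gcc/LLVM has flourished, greatly reducing new processors'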