Page 372, Journal of Software (《软件学报》), 2025, No. 10
YANG Le et al.: BIVM: A Compilation Framework for Brain-inspired Computing and Its Prototype
(Zhongguancun Laboratory, Beijing 100094, China)
Abstract: Brain-inspired computing chips of various architectures are emerging, and the inference/training/learning algorithms of spiking
neural network (SNN) and the efficient simulation of biological neural networks have become research hotspots. Meanwhile, efficiently
executing applications with different computation/memory-access characteristics on various chips remains a significant challenge, which is
crucial for establishing a robust brain-inspired computing ecosystem. The success of the general-purpose computing ecosystem indicates
that a flexible, scalable, and reusable compiler infrastructure is an effective solution to this problem. This study proposes BIVM, a
compilation framework for brain-inspired computing, along with its proof-of-concept implementation. Based on the multi-level intermediate
representation (MLIR) framework for domain-specific architectures (DSAs), multi-layer IRs customized for SNNs are designed, including an
SNN dialect, middle-layer IRs composed mainly of MLIR's built-in dialects, and underlying IRs for various target chips. To address
challenges such as the large architectural differences and varying granularity of hardware primitives in brain-inspired chips, BIVM
leverages MLIR's progressivity feature. This allows different abstraction levels and concepts to be mixed (e.g., combining fine-grained
instructions with coarse-grained computation based on the crossbar structure specific to certain back-ends), enabling software module reuse
and reducing compiler development costs, ultimately leading to high productivity. In addition, the framework provides flexibility to
combine various levels of compilation optimizations, including widely used SNN-specific optimizations (e.g., exploiting computation sparsity
and improving parallelism) and low-level optimizations tailored to different back-ends, ensuring performance portability. The current BIVM
prototype supports back-ends such as general-purpose processors (control-flow architecture), SNN accelerator chips (FPGAs) with a hybrid
control-/data-flow architecture, and data-flow chip designs based on ReRAM (resistive random-access memory, a widely-used neuromorphic
device). It can optimize and compile deep SNN and biological neural network simulation applications into executables tailored for these
chips. Comprehensive testing and performance comparisons demonstrate the potential of this compilation framework in achieving high
productivity, portability, and performance.
Key words: brain-inspired computing; compilation framework; brain-inspired computing chip
Brain-inspired computing is regarded as a highly promising route toward next-generation artificial intelligence [1] and one of the major development directions for computer architecture in the post-Moore era [2]; research on it has also advanced more general compute-in-memory architectures [3]. Many research teams focus on developing new types of brain-inspired computing architectures and chips, but efficiently running the various brain-inspired applications on these architectures/chips (including deep spiking neural networks, or D-SNNs, oriented toward AI tasks, as well as simulation workloads focused on modeling biological spiking neural networks) remains difficult. Meanwhile, the teams developing brain-inspired computing models/algorithms also seek efficient computation on the various brain-inspired chips: the supply and convenient use of efficient computing power is critical to the development of algorithms, as the history of deep learning has confirmed [4].
Brain-inspired computing compilation software must play the key role of a "bridge" here, transforming the various brain-inspired computing applications into executable forms that can efficiently drive architecturally diverse brain-inspired chips. This also helps unite research efforts across fields [5] into a healthy R&D ecosystem, for example by applying new findings from neuroscience to AI, or by putting the latest (brain-inspired) computing chips into broad and efficient use so that new algorithms can evolve and be deployed quickly.
To the best of our knowledge, a general compilation framework covering the various brain-inspired computing chips does not yet exist. Such a framework is a compiler infrastructure that is not tied to specific hardware: through a flexible multi-level internal architecture, intermediate representations, and interfaces, it can be extended to support a wide variety of brain-inspired computing applications and brain-inspired chips.
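The D-SNN and simulation workloads such a framework handles ultimately reduce to simple per-time-step neuron updates that the compiler must lower to each chip's primitives. The following is a minimal, illustrative Python sketch of a discrete-time leaky integrate-and-fire (LIF) layer; the function name and constants are assumptions for illustration, not part of BIVM or any tool named above.

```python
def lif_step(v, x, decay=0.9, v_th=1.0, v_reset=0.0):
    """One simulation time step for a vector of LIF neurons.

    v: membrane potentials, x: input currents (lists of equal length).
    Returns (new_potentials, output_spikes).
    """
    v_next, spikes = [], []
    for vi, xi in zip(v, x):
        vi = decay * vi + xi          # leaky integration
        if vi >= v_th:                # threshold crossing emits a spike
            spikes.append(1)
            v_next.append(v_reset)    # hard reset after firing
        else:
            spikes.append(0)
            v_next.append(vi)
    return v_next, spikes

# Two time steps over three neurons.
v = [0.0, 0.0, 0.0]
v, s1 = lif_step(v, [0.5, 1.2, 0.0])  # neuron 1 fires immediately
v, s2 = lif_step(v, [0.6, 0.0, 0.0])  # neuron 0 crosses threshold now
print(s1, s2)                          # → [0, 1, 0] [1, 0, 0]
```

The binary spike outputs are what makes SNN-specific optimizations such as sparsity exploitation attractive: downstream synaptic computation only needs to process the neurons that fired in a given step.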
This differs from the many existing compilers built for specific brain-inspired chips. Those can achieve good performance on their target architectures, but their software modules and inter-layer interfaces are tailored to the target chip, which limits module reusability, toolchain interoperability, and composability, and makes developing a compiler for a new chip costly.
The compilation framework mainly serves developers of brain-inspired chips and of brain-inspired application development tools. Many development tools aimed at application/model developers already exist, such as BindsNET [6], Norse [7], SpykeTorch [8], and SpikingJelly [9]. Ideally, adapting a brain-inspired chip's software/hardware interface or functional library to the framework's lower-level representations is enough to support all kinds of brain-inspired applications, while adapting a development tool's front-end to the framework broadens the range of hardware that tool supports. A bridge is thus built between the various brain-inspired applications and chips: a high-level, hardware-independent neural network application is written once and can be compiled into executables for different chips (portability) with high execution efficiency (performance), and the framework itself can be flexibly extended or reused to support many types of brain-inspired chips and compilation optimization techniques (productivity). The effectiveness of this methodology has been proven by computing history: in general-purpose computing, the application ecosystems built on the open-source compiler frameworks gcc/LLVM have flourished, greatly lowering the cost for new processors

