Page 40 - Journal of Software (《软件学报》), 2025, Issue 9
Li Yijin (李奕瑾) et al.: Hybrid Instruction Scheduling Algorithm Based on the RISC-V VLIW Architecture 3951
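In the data-flow analysis of the theoretical model, the two cases shown in Figure 15(a) and Figure 15(b) cannot be distinguished. Therefore, when the predecessor paths have resource dependencies on one another, as in Figure 15(a), the theoretical model cannot obtain the IPC supremum.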
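[Figure 15 (graphic not reproduced): two dependency graphs over nodes 0 to 4, each with two predecessor paths into node 4 and every edge labeled d=1. Node types in (a): 0: A, 1: L, 2: A, 3: L, 4: A. Node types in (b): 0: A, 1: L, 2: L, 3: A, 4: A.]
(a) Theoretical model does not reach the supremum (b) Theoretical model can reach the supremum
Figure 15 Examples in which the theoretical model does not reach / can reach the IPC supremum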
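The difference between the two panels of Figure 15 can be reproduced with a small experiment. The sketch below is an illustration under stated assumptions, not the paper's IPC theoretical model: the edge structure (paths 0 → 1 → 4 and 2 → 3 → 4) is read off the figure labels, the machine is assumed to issue at most one A-class and one L-class instruction per cycle, and the bound is a generic stand-in (the larger of the critical-path length and the per-class resource bound). All function and variable names are hypothetical.

```python
from itertools import product
from math import ceil

# Assumed machine: one A-class slot and one L-class slot per cycle (hypothetical).
UNITS = {"A": 1, "L": 1}
# Assumed edges (src, dst, delay): paths 0 -> 1 -> 4 and 2 -> 3 -> 4, all d=1.
EDGES = [(0, 1, 1), (1, 4, 1), (2, 3, 1), (3, 4, 1)]
GRAPH_A = ["A", "L", "A", "L", "A"]  # Fig. 15(a): kinds of nodes 0..4
GRAPH_B = ["A", "L", "L", "A", "A"]  # Fig. 15(b): kinds of nodes 0..4


def lower_bound(kinds, edges, units):
    """Cycle lower bound: max(critical-path length, per-class resource bound)."""
    start = [0] * len(kinds)
    # Requires src < dst on every edge, so sorted order is a topological order.
    for src, dst, delay in sorted(edges):
        start[dst] = max(start[dst], start[src] + delay)
    critical = max(start) + 1
    resource = max(ceil(kinds.count(k) / cap) for k, cap in units.items())
    return max(critical, resource)


def optimal_cycles(kinds, edges, units):
    """Shortest feasible schedule length, found by exhaustive search."""
    n = len(kinds)
    length = lower_bound(kinds, edges, units)
    while True:
        for cyc in product(range(length), repeat=n):
            deps_ok = all(cyc[d] >= cyc[s] + lat for s, d, lat in edges)
            res_ok = deps_ok and all(
                sum(1 for i in range(n) if cyc[i] == t and kinds[i] == k) <= cap
                for t in range(length)
                for k, cap in units.items()
            )
            if res_ok:
                return length
        length += 1


for name, kinds in (("a", GRAPH_A), ("b", GRAPH_B)):
    bound = lower_bound(kinds, EDGES, UNITS)
    best = optimal_cycles(kinds, EDGES, UNITS)
    print(f"({name}) bound={bound} cycles, optimum={best} cycles, "
          f"IPC bound={len(kinds) / bound:.2f}, best IPC={len(kinds) / best:.2f}")
```

Under these assumptions both graphs get the same 3-cycle bound, because the bound only sees path lengths and instruction counts. In (a) nodes 0 and 2 compete for the single A slot in the first cycle, so the best schedule needs 4 cycles; in (b) the two paths interleave their A and L instructions and the 3-cycle bound is attained.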
7 Conclusion
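This paper designs a variable-length VLIW architecture based on the RISC-V instruction set and proposes a hybrid instruction scheduling algorithm guided by an IPC theoretical model for individual scheduling regions. Hybrid scheduling uses the IPC theoretical model to locate the scheduling regions in which list scheduling may not have found the optimal schedule, and then applies programming-based scheduling to those regions. The core of hybrid scheduling is the accuracy of the IPC theoretical model: the model proposed in this paper achieves an accuracy of 95.74% and an F1 score of 97.79%. It identifies 94.62% of the scheduling regions as already optimal under list scheduling, so only 5.38% of the scheduling regions require further programming-based scheduling. As a result, the hybrid instruction scheduling algorithm reaches the schedules obtained by programming-based scheduling at a complexity close to that of list scheduling.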
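The control flow summarized above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: `Schedule`, `hybrid_schedule`, and the three callables are hypothetical names, and the test "list-scheduling IPC reaches the model's bound" is an assumed reading of how the IPC theoretical model flags a region for further scheduling.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Schedule:
    n_instructions: int  # instructions in the scheduling region
    cycles: int          # length of the schedule in cycles

    @property
    def ipc(self) -> float:
        return self.n_instructions / self.cycles


def hybrid_schedule(
    region: Any,
    list_scheduler: Callable[[Any], Schedule],
    ipc_bound: Callable[[Any], float],
    exact_scheduler: Callable[[Any], Schedule],
    eps: float = 1e-9,
) -> Tuple[Schedule, bool]:
    """Return (schedule, used_exact) for one scheduling region."""
    fast = list_scheduler(region)
    # If list scheduling already attains the bound, it is provably optimal.
    if fast.ipc >= ipc_bound(region) - eps:
        return fast, False
    # Otherwise the region *may* be suboptimal: fall back to the costly scheduler.
    return exact_scheduler(region), True
```

With this structure the expensive scheduler runs only on flagged regions. A bound that is not tight, as in Figure 15(a), causes a false alarm that costs compile time but does not degrade the final schedule.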