In this paper, we develop a cycle-accurate system simulator based on gem5 that integrates an AMD GPU simulator and the ROCm toolchain to emulate an NPU environment. This integration makes it possible to study the interaction between CPU and NPU components within a single simulated system. We validated the simulator's reliability and accuracy by developing and testing a Matrix Transpose kernel compiled with the ROCm toolchain, demonstrating its potential as a tool for research and development of integrated CPU/NPU systems.
Table of Contents
Abstract
I. INTRODUCTION
II. SIMULATOR ARCHITECTURE
III. NPU SIMULATION
IV. EVALUATION
V. CONCLUSION
ACKNOWLEDGMENT
REFERENCES
Authors
Kyu Hyun Choi [SoC Platform Research Center, Korea Electronics Technology Institute]
Corresponding Author
Seokhun Jeon [SoC Platform Research Center, Korea Electronics Technology Institute]