Implementing Cost-Effective CNNs through INT8 Quantization Aware Training on Embedded Systems

Article information

Abstract

English
The rising popularity of intelligent embedded systems, coupled with the substantial computational and memory requirements of convolutional neural networks (CNNs), calls for cost-effective on-device model inference. Various post-training optimization techniques are used to reduce model size and numerical precision, but they often cause a significant drop in performance. To address this issue, we propose a quantization-aware training (QAT) strategy that optimizes CNNs for low-bit integers, resulting in faster inference and lower memory usage. We inject fake quantization modules into the original architecture, train the model in full precision, and then convert the model to 8-bit integers (INT8). The resulting QAT model performs all computation in the convolution, activation, and batch-normalization layers in INT8. Our method reduces the size of ResNet50 and ResNet101 by a factor of 3.9 and improves inference speed by more than 2x. We evaluate the models on the CIFAR-10 and CIFAR-100 datasets.
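
The fake quantization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`qparams`, `quantize`, `dequantize`, `fake_quantize`), the per-tensor affine scheme, and the sample weights are all assumptions chosen for illustration. The idea is that a value is rounded to the INT8 grid and immediately mapped back to floating point, so the rest of the network sees the rounding error during training while still computing in floats.

```python
# Illustrative sketch only: per-tensor affine INT8 fake quantization.
# All names and the sample data are hypothetical, not taken from the paper.

def qparams(x_min, x_max, qmin=-128, qmax=127):
    """Derive a scale and zero point that map [x_min, x_max] onto INT8."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:  # degenerate case: all values are zero
        scale = 1.0
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an integer in [qmin, qmax]."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map an INT8 code back to a float."""
    return (q - zero_point) * scale

def fake_quantize(values, scale, zero_point):
    """Quantize then dequantize, keeping the result in floating point."""
    return [dequantize(quantize(v, scale, zero_point), scale, zero_point)
            for v in values]

weights = [-0.51, -0.2, 0.0, 0.13, 0.76]  # made-up example weights
scale, zero_point = qparams(min(weights), max(weights))
codes = [quantize(w, scale, zero_point) for w in weights]
approx = fake_quantize(weights, scale, zero_point)
max_error = max(abs(w - a) for w, a in zip(weights, approx))
print(codes, max_error)
```

In a training framework the rounding has no useful gradient, so QAT implementations commonly pass gradients through this step unchanged (a straight-through estimator); that part is omitted here. Storing weights as 8-bit codes instead of 32-bit floats gives at most a 4x size reduction, which is in line with the reported factor of 3.9.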

Table of contents

Abstract
1. Introduction
2. Methods
2.1. Dataset
2.2. Experiment Setup
3. Experiment result
3.1. Analysis and Future Refinement
4. Conclusions
Acknowledgement
References

Authors

  • Saeed Ahmad [ Dept. of Software, Korea National University of Transportation, Chungju-si, Republic of Korea ]
  • Sharjeel Masood [ Dept. of IT·Energy Convergence, Korea National University of Transportation, Chungju-si, Republic of Korea ]
  • Xufeng Hu [ Dept. of IT·Energy Convergence, Korea National University of Transportation ]
  • Namjung Kim [ Dept. of Software, Korea National University of Transportation, Chungju-si, Republic of Korea ]
  • Changjoon Park [ Dept. of IT·Energy Convergence, Korea National University of Transportation, Chungju-si, Republic of Korea ]
  • Jeonghwan Gwak [ Dept. of Computer Software, Korea National University of Transportation, Chungju-si, Republic of Korea ] Corresponding author

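    Publication information

    • Publication
      Korean Institute of Next Generation Computing Conference (한국차세대컴퓨팅학회 학술대회)
    • Frequency
      Semiannual
    • Coverage period
      2021~2025
    • Decimal classification
      KDC 566 DDC 004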