Sharjeel Masood, Saeed Ahmad, Xufeng Hu, Changjoon Park, Namjung Kim, Jeonghwan Gwak
Language
English (ENG)
URL
https://www.earticle.net/Article/A468795
Source information
Abstract
English
Artificial neural networks have steadily grown in size and complexity, and their resource demands have grown with them. The resulting computational requirements and processing times make them impractical for real-life deployment scenarios involving embedded systems. Resource-constrained environments such as mobile devices, IoT gadgets, and edge computing platforms demand efficient models with low computational complexity and fast real-time inference. We have developed an iterative pruning technique that reduces a model's inference time by pruning its less essential neurons. Unlike traditional pruning methods, which require a separate pruning step after training, our technique prunes the network gradually as it learns. This lets the model adapt dynamically as unnecessary parameters are removed, while maintaining accuracy. The technique works by temporarily reducing the weights of a few neurons and then observing how strongly the network resists the change to those neurons. Neurons with high resistance are restored to their original state, while those with low resistance are pruned.
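The damp, observe, then restore-or-prune cycle described in the abstract can be illustrated with a small sketch. The abstract does not say how resistance is measured, so the code below assumes one possible definition: the fraction of a neuron's lost weight norm that a few further training steps win back. The names `resistance_prune_step` and `make_toy_trainer`, the `damp` and `threshold` parameters, and the toy training rule are all hypothetical and are not taken from the paper.

```python
import math


def norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def resistance_prune_step(weights, candidates, train_fn, damp=0.1, threshold=0.5):
    """One damp/train/measure round on a list of per-neuron weight rows.

    Returns the updated rows and a keep-mask (False means pruned).
    """
    weights = [row[:] for row in weights]
    original = {i: weights[i][:] for i in candidates}

    # 1. Temporarily shrink the weights of the candidate neurons.
    for i in candidates:
        weights[i] = [damp * x for x in weights[i]]
    damped_norm = {i: norm(weights[i]) for i in candidates}

    # 2. Let training continue briefly on the perturbed network.
    weights = train_fn(weights)

    keep = [True] * len(weights)
    for i in candidates:
        # 3. Resistance (assumed definition): share of the lost norm
        #    that training recovered for this neuron.
        lost = norm(original[i]) - damped_norm[i]
        recovered = norm(weights[i]) - damped_norm[i]
        resistance = recovered / max(lost, 1e-12)

        # 4. Restore high-resistance neurons, zero out (prune) the rest.
        if resistance >= threshold:
            weights[i] = original[i]
        else:
            weights[i] = [0.0] * len(weights[i])
            keep[i] = False
    return weights, keep


def make_toy_trainer(target, lr=0.2, steps=10):
    """Hypothetical stand-in for training: gradient descent pulling
    each weight toward a fixed target value."""
    def train_fn(weights):
        for _ in range(steps):
            weights = [[w - lr * (w - t) for w, t in zip(w_row, t_row)]
                       for w_row, t_row in zip(weights, target)]
        return weights
    return train_fn


# Neuron 0 matters to the toy objective, neuron 1 does not,
# and neuron 2 is not a pruning candidate in this round.
target = [[1.0, 1.0], [0.0, 0.0], [2.0, -1.0]]
weights = [[1.0, 1.0], [0.3, -0.2], [2.0, -1.0]]
pruned, keep = resistance_prune_step(weights, [0, 1], make_toy_trainer(target))
```

In this toy setup, training pulls the damped neuron 0 back toward its original weights, so it is restored, while neuron 1 shrinks further and is pruned. In a real network, `train_fn` would be a few optimizer steps on actual data, and the resistance measure might instead be based on loss or gradient behaviour; that choice is left open here.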