Evolutionary multi-objective model compression for deep neural networks

Z. Wang, T. Luo, Miqing Li, J.T. Zhou, R.S.M Goh, Liangli Zhen

Research output: Contribution to journal › Article › peer-review

Abstract

While deep neural networks (DNNs) deliver state-of-the-art accuracy on various applications from face recognition to language translation, this comes at the cost of high computational and space complexity, hindering their deployment on edge devices. To enable efficient processing of DNNs in inference, a novel approach, called Evolutionary Multi-Objective Model Compression (EMOMC), is proposed to optimize energy efficiency (or model size) and accuracy simultaneously. Specifically, the network pruning and quantization space is explored and exploited by using architecture population evolution. Furthermore, by taking advantage of the orthogonality between pruning and quantization, a two-stage pruning and quantization co-optimization strategy is developed, which considerably reduces the time cost of the architecture search. Lastly, different dataflow designs and parameter coding schemes are considered in the optimization process, since they have a significant impact on energy consumption and model size. Owing to the cooperative evolution between different architectures in the population, a set of compact DNNs that offer trade-offs on different objectives (e.g., accuracy, energy efficiency, and model size) can be obtained in a single run. Unlike most existing approaches, which are designed to reduce the size of weight parameters with no significant loss of accuracy, the proposed method aims to achieve a trade-off between desirable objectives, meeting the different requirements of various edge devices. Experimental results demonstrate that the proposed approach can obtain a diverse population of compact DNNs suitable for a broad range of memory usage and energy consumption requirements. Under negligible accuracy loss, EMOMC improves the energy efficiency and model compression rate of VGG-16 on CIFAR-10 by factors of more than 8.9X and 2.4X, respectively.
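The population-based multi-objective search described in the abstract can be illustrated with a minimal Pareto-selection sketch. This is not the paper's EMOMC implementation: the configuration encoding (a pruning rate and a quantization bit-width per individual) and the toy objective functions below are illustrative assumptions standing in for real accuracy and energy measurements.

```python
import random

def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one. Both objectives here (error, energy) are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(pop, evaluate):
    """Return the non-dominated subset of the population."""
    scored = [(ind, evaluate(ind)) for ind in pop]
    return [ind for ind, f in scored
            if not any(dominates(g, f) for _, g in scored if g != f)]

def toy_objectives(ind):
    """Hypothetical stand-ins for measured accuracy loss and energy cost.
    An individual is (pruning_rate, bit_width)."""
    prune, bits = ind
    error = 0.1 + 0.5 * prune + 0.3 / bits   # more compression -> more error
    energy = (1 - prune) * bits              # remaining weights x precision
    return error, energy

def evolve(generations=30, pop_size=20, seed=0):
    """Evolve a population of compression configurations; survivors of each
    generation are the current Pareto front, refilled with mutated offspring."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.0, 0.9), rng.choice([2, 4, 8, 16]))
           for _ in range(pop_size)]
    for _ in range(generations):
        front = pareto_front(pop, toy_objectives)
        children = []
        for _ in range(pop_size - len(front)):
            parent = rng.choice(front)
            prune = min(0.95, max(0.0, parent[0] + rng.gauss(0, 0.05)))
            children.append((prune, rng.choice([2, 4, 8, 16])))
        pop = front + children
    return pareto_front(pop, toy_objectives)
```

A single run returns a set of mutually non-dominated configurations, mirroring the abstract's point that one search yields a whole population of accuracy/efficiency trade-offs rather than a single compressed model.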
Original language: English
Article number: 9492169
Pages (from-to): 10-21
Number of pages: 12
Journal: IEEE Computational Intelligence Magazine
Volume: 16
Issue number: 3
Early online date: 21 Jul 2021
DOIs
Publication status: Published - Aug 2021

Bibliographical note

Funding Information:
This work is partly supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme (No. A18A1b0045 and No. A1687b0033).

Publisher Copyright:
© 2005-2012 IEEE.

Keywords

  • Deep learning
  • Energy consumption
  • Quantization (signal)
  • Computational modeling
  • Sociology
  • Memory management
  • Energy efficiency
  • Deep neural networks

ASJC Scopus subject areas

  • Artificial Intelligence
  • Theoretical Computer Science
