Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

Category: Network Pruning, Network Quantization
Year/Month: 2015-10
Status:
Publications: ICLR
Code:

TL;DR

Proposes a three-stage pipeline for compressing deep neural networks:
  1. Pruning
    1. cuts the number of connections by 9-13x
  2. Trained quantization (weight sharing)
    1. reduces weights from 32 bits to 5 bits
  3. Huffman coding
→ Together these reduce storage by 35-49x without affecting accuracy (a toy sketch of the pipeline follows below).

Fitting models on-device this way brings benefits in privacy, network bandwidth, storage, and energy consumption.
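
To make the three stages concrete, here is a minimal Python sketch on a toy weight matrix. This is my own illustration, not the authors' released code: `prune_by_magnitude`, `kmeans_quantize`, and `huffman_code` are hypothetical names, and the 90% sparsity level is an assumed value; the 5-bit codebook and linearly spaced centroid initialization do follow the paper's description.

```python
# Minimal sketch of Deep Compression's three stages on a toy weight matrix.
# Function names and the sparsity level are hypothetical, not the authors' code.
import heapq
from collections import Counter

import numpy as np


def prune_by_magnitude(w, sparsity=0.9):
    """Stage 1: zero out the smallest-magnitude weights (sparsity = fraction removed)."""
    thresh = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) > thresh
    return w * mask, mask


def kmeans_quantize(w, mask, bits=5, iters=20):
    """Stage 2: share weights by clustering survivors into 2**bits values.

    Plain Lloyd iterations with linearly spaced initial centroids,
    the initialization the paper reports working best.
    """
    vals = w[mask]
    k = 2 ** bits
    centroids = np.linspace(vals.min(), vals.max(), k)
    for _ in range(iters):
        assign = np.argmin(np.abs(vals[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            members = vals[assign == j]
            if members.size:
                centroids[j] = members.mean()
    return assign, centroids


def huffman_code(symbols):
    """Stage 3: build a Huffman table {symbol: bitstring} for the cluster indices."""
    heap = [[freq, i, {s: ""}] for i, (s, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    tie = len(heap)  # unique tiebreaker so dicts are never compared
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for table, bit in ((lo[2], "0"), (hi[2], "1")):
            for s in table:
                table[s] = bit + table[s]
        heapq.heappush(heap, [lo[0] + hi[0], tie, {**lo[2], **hi[2]}])
        tie += 1
    return heap[0][2]


rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned, mask = prune_by_magnitude(w)               # 9-13x fewer connections in the paper
assign, centroids = kmeans_quantize(w_pruned, mask)  # 32-bit floats -> 5-bit indices
table = huffman_code(assign.tolist())                # variable-length codes for the indices

total_bits = sum(len(table[s]) for s in assign.tolist())
print(f"kept {int(mask.sum())}/{w.size} weights; "
      f"{total_bits / mask.sum():.2f} bits/weight after Huffman (vs 5 fixed)")
```

Note the sketch omits the retraining the paper relies on: the pruned network is retrained to recover accuracy, and the shared centroids are fine-tuned by gradient descent before Huffman coding.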

Motivation

Method

Experiments