Channel Pruning for Accelerating Very Deep Neural Networks

May 2019

tl;dr: Pruning filters by minimizing feature map reconstruction error.

Overall impression

This paper is closely related to Pruning Filters, which ranks filters by their L1 norm. It is also highly influential in the model compression field, with 224 citations as of 05/26/2019.

The paper demonstrates that max-response pruning with the L1 norm (as in Pruning Filters) is sometimes worse than random pruning. The authors argue that the max-response criterion ignores correlation between filters: filters with large absolute weights may be strongly correlated, so keeping all of them is redundant.
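A toy sketch of the reconstruction-error criterion, under assumptions: greedy backward selection stands in for the paper's LASSO channel-selection step, and the layer is treated as a linear map from unrolled input-channel patches to output responses. The least-squares weight reconstruction after selection follows the paper's second step. Note how two perfectly correlated channels both carry large weights, so an L1 criterion would keep both, yet pruning one costs nothing after reconstruction.

```python
import numpy as np

def recon_error(X, kept, Y):
    """Least-squares refit of weights over the kept channels;
    return the residual reconstruction error and refit weights."""
    N = X.shape[0]
    Xk = X[:, kept].reshape(N, -1)              # (N, keep*k)
    Wk, *_ = np.linalg.lstsq(Xk, Y, rcond=None)
    err = float(((Y - Xk @ Wk) ** 2).sum())
    return err, Wk

def prune_channels(X, Y, keep):
    """Greedy backward channel selection minimizing feature-map
    reconstruction error (stand-in for the paper's LASSO step)."""
    kept = list(range(X.shape[1]))
    while len(kept) > keep:
        # Drop the channel whose removal (after refitting the
        # remaining weights) increases the error the least.
        cands = [(recon_error(X, [i for i in kept if i != c], Y)[0], c)
                 for c in kept]
        _, drop = min(cands)
        kept.remove(drop)
    err, Wk = recon_error(X, kept, Y)
    return kept, Wk.reshape(len(kept), X.shape[2], -1), err

rng = np.random.default_rng(0)
N, C, k, M = 200, 6, 9, 4          # samples, channels, patch size, outputs
X = rng.normal(size=(N, C, k))     # unrolled input activations per channel
X[:, 5] = X[:, 3]                  # two perfectly correlated channels
W = rng.normal(size=(C, k, M))     # both duplicates have large weights
Y = sum(X[:, c] @ W[c] for c in range(C))

kept, W_new, err = prune_channels(X, Y, keep=C - 1)
# One of the duplicated channels is pruned, and least-squares
# reconstruction recovers the original outputs almost exactly.
assert 3 not in kept or 5 not in kept
assert err < 1e-6
```

The greedy loop here is quadratic in the channel count and only illustrative; the paper's LASSO formulation scales better and trades off sparsity against reconstruction error with a single penalty coefficient.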

See LeGR for a more recent review and update of this work.

Key ideas

Technical details