The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks


May 2019

tl;dr: The author states the lottery ticket hypothesis: A randomly-initialized, dense neural network contains a sub-network that is initialized such that – when trained in isolation – it can match the est accuracy of the original network after training for at most the same number of iterations. This could be a good approach to compress the NN without harming performance too much.

Overall impression

By using the pruning method of this paper, the winning tickets(pruned network) are 10-20% (or less) of the size of the original network. Down to that size, those networks meet or exceed the original network’s test accuracy in at most the same number of iterations.

Key ideas

Technical details