SlowFast Networks for Video Recognition

Feb 2019

tl;dr: Understand video with two pathways: a slow pathway that captures spatial information and a fast pathway that tracks motion. The design is biologically inspired by the P cells and M cells among the retinal ganglion cells.
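One way to picture the two pathways is as two samplings of the same clip at different temporal rates. The sketch below is only an illustration under stated assumptions: the function name `sample_pathway_indices` and the parameters `slow_stride` and `speed_ratio` are hypothetical, and the default values are picked for the example rather than taken from these notes.

```python
def sample_pathway_indices(num_frames, slow_stride=16, speed_ratio=8):
    """Return (slow, fast) frame indices for a clip of num_frames raw frames.

    slow_stride: assumed temporal stride of the slow pathway.
    speed_ratio: assumed factor by which the fast pathway samples more densely.
    """
    if slow_stride % speed_ratio != 0:
        raise ValueError("slow_stride must be divisible by speed_ratio")
    fast_stride = slow_stride // speed_ratio
    # Slow pathway: few frames, sampled sparsely in time.
    slow = list(range(0, num_frames, slow_stride))
    # Fast pathway: many frames, sampled densely in time.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


slow_idx, fast_idx = sample_pathway_indices(64)
print(len(slow_idx), len(fast_idx))  # 4 32
```

With these assumed defaults, a 64-frame clip gives the slow pathway 4 frames and the fast pathway 32, so the fast pathway sees every frame the slow one does plus the ones in between.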

Overall impression

The paper is quite eye-opening, particularly its analogy to the ganglion cells in the primate retina. The P cells are sensitive to color and have high acuity, but are slow. The M cells are colorblind and have low acuity, but are fast. The M cells also have a larger receptive field (yes, "receptive field" originates as a term in neurophysiology). I believe this is the future direction of video recognition.

Key ideas

Technical details