MT-CNN: Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

September 2019

tl;dr: One of the most widely used methods for face detection and facial landmark regression.

Overall impression

The paper seems rather primitive compared to general object detection frameworks like Faster R-CNN. MTCNN is more like the original R-CNN method, in that each candidate window is cropped from the image and passed through a separate CNN.

However, it is also enlightening that a very shallow CNN (O-Net) applied on top of cropped image patches can regress landmarks accurately. Landmark regression given an object bbox may not require that large a receptive field anyway.
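
To put a rough number on the receptive field point, here is a minimal sketch that computes the receptive field of a stack of conv/pool layers from their kernel sizes and strides. The layer list `ONET_LAYERS` is an assumption recalled from the O-Net figure in the paper (48x48 input) and is not stated in these notes, so check it against the paper before relying on it. The names `receptive_field` and `ONET_LAYERS` are made up for this illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit of a conv/pool stack.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    Padding does not change the receptive field size, so it is ignored here.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent units at this depth
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# ASSUMPTION: O-Net conv/pool stack as recalled from the paper, not from these notes.
ONET_LAYERS = [
    (3, 1),  # conv 3x3
    (3, 2),  # max pool 3x3, stride 2
    (3, 1),  # conv 3x3
    (3, 2),  # max pool 3x3, stride 2
    (3, 1),  # conv 3x3
    (2, 2),  # max pool 2x2, stride 2
    (2, 1),  # conv 2x2
]

onet_rf = receptive_field(ONET_LAYERS)
print(onet_rf)
```

Under the assumed layer list this gives 33 pixels for the last conv layer, inside a 48-pixel crop. The fully connected layer that follows sees the whole crop, so the conv features themselves stay fairly local, which fits the observation above.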

The paper is largely inspired by Hua Gang’s paper cascnn: A Convolutional Neural Network Cascade for Face Detection.

Key ideas

Technical details