Double Anchor R-CNN for Human Detection in a Crowd

October 2020

tl;dr: Double Anchor RPN is developed to capture body and head parts in pairs.

Overall impression

Crowd occlusion is challenging for two reasons:

The intuition behind the paper is simple: compared with the full human body, the head usually has a smaller scale, less overlap with neighboring people, and a clearer view in real-world images, so it is more robust to pose variation and crowd occlusion. –> This motivation is very similar to that of R2 NMS and VG NMS.
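
To make the "less overlap" point concrete, here is a minimal sketch that compares the IoU of two body boxes with the IoU of the corresponding head boxes for two people standing side by side. The box coordinates are made-up illustrative values, not data from the paper, and the `iou` helper is a generic implementation rather than the authors' code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes (pixels): two people partially occluding each other.
body_a, body_b = (0, 0, 60, 180), (30, 0, 90, 180)
head_a, head_b = (20, 0, 40, 25), (50, 0, 70, 25)

print("body IoU:", iou(body_a, body_b))
print("head IoU:", iou(head_a, head_b))
```

In this toy setup the bodies overlap substantially while the heads do not overlap at all, which is the situation in which a head box is the easier cue for telling two people apart.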

One main challenge in crowd detection is high-scoring false positives. –> Safety-wise, however, this does not seem to be an issue for autonomous driving.

Key ideas

Technical details