Learning-Deep-Learning

Repulsion Loss: Detecting Pedestrians in a Crowd

October 2020

tl;dr: A novel bbox regression loss designed specifically for crowd scenes. It not only pushes each proposal toward its designated target, but also keeps it away from other surrounding objects.

Overall impression

The paper gives a solid analysis of why detection is difficult under crowd occlusion.

Two issues with crowd occlusion: 1) It makes bbox localization harder, as it is hard to tell different GT boxes apart when choosing the regression target. 2) NMS becomes more sensitive to the IoU threshold (a higher threshold brings more FP, and a lower threshold leads to missed detections).
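The threshold sensitivity in point 2 can be reproduced with a toy example. The following is a minimal sketch of standard greedy NMS in plain Python, not code from the paper; the boxes, scores, and thresholds are made-up values chosen so that two overlapping pedestrians (boxes 0 and 1) and one duplicate detection of the first pedestrian (box 2) behave differently at each threshold.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, thresh):
    """Keep boxes in descending score order; drop any box whose IoU
    with an already kept box exceeds thresh. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


# Hypothetical toy scene: two real pedestrians standing close together,
# plus a slightly shifted duplicate detection of the first one.
boxes = [
    (0, 0, 10, 20),  # pedestrian A
    (4, 0, 14, 20),  # pedestrian B, IoU with A is about 0.43
    (1, 0, 11, 20),  # duplicate of A, IoU with A is about 0.82
]
scores = [0.9, 0.8, 0.7]

for t in (0.3, 0.5, 0.85):
    print(t, greedy_nms(boxes, scores, t))
# 0.3  -> [0]        pedestrian B is suppressed (missed detection)
# 0.5  -> [0, 1]     both pedestrians kept, duplicate removed
# 0.85 -> [0, 1, 2]  duplicate survives (false positive)
```

In this toy scene only a narrow band of thresholds gives the right answer, and the band shrinks as the two pedestrians overlap more.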

Thus bbox regression with RepLoss is driven by two motivations: attraction by the designated target, and repulsion from other surrounding targets (GT) and other proposals (predictions).
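The repulsion from surrounding GT boxes can be sketched as below. These notes do not spell out the formula, so this assumes the formulation I recall from the paper: the repulsion target is the GT box with the highest IoU with the proposal other than its designated target, and the penalty is a smoothed log function of the intersection-over-ground-truth (IoG) between the predicted box and that repulsion target. The function names, the `sigma=0.5` default, and the toy boxes are illustrative choices, not the authors' code.

```python
import math


def intersection(a, b):
    """Intersection area of two (x1, y1, x2, y2) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h


def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])


def iou(a, b):
    inter = intersection(a, b)
    return inter / (area(a) + area(b) - inter)


def iog(pred, gt):
    """Intersection over ground truth: normalised by the GT area only,
    so the prediction cannot lower the penalty by simply growing."""
    return intersection(pred, gt) / area(gt)


def smooth_ln(x, sigma=0.5):
    """-ln(1 - x) up to sigma, then a linear continuation.
    Assumes 0 <= x <= 1 and 0 <= sigma < 1."""
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)


def rep_gt_term(pred, proposal, gts, sigma=0.5):
    """Repulsion penalty of one predicted box from its nearest non-target GT."""
    # Attraction target: the GT with the highest IoU with the proposal.
    attr = max(range(len(gts)), key=lambda i: iou(proposal, gts[i]))
    others = [g for i, g in enumerate(gts) if i != attr]
    if not others:
        return 0.0
    # Repulsion target: the best-overlapping GT among the remaining ones.
    rep = max(others, key=lambda g: iou(proposal, g))
    return smooth_ln(iog(pred, rep), sigma)


# Hypothetical scene: two neighbouring pedestrians, one proposal assigned to the first.
gts = [(0, 0, 10, 20), (8, 0, 18, 20)]
proposal = (1, 0, 11, 20)
tight = (0, 0, 10, 20)    # regressed box sitting on its own target
drifted = (4, 0, 14, 20)  # regressed box shifted toward the neighbour

print(rep_gt_term(tight, proposal, gts))    # about 0.223
print(rep_gt_term(drifted, proposal, gts))  # about 0.893
```

A full RepLoss would add this term (and the analogous box-to-box term) to the usual attraction loss with weighting factors; the attraction part is omitted here.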

Both RepLoss and AggLoss propose additional penalties that produce more compact bounding boxes and make the detector less sensitive to the NMS threshold. They also penalize bboxes that land in the middle of two pedestrians.
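The link between compact boxes and NMS robustness can be illustrated with a proposal-to-proposal repulsion term. This is a rough sketch assuming the RepBox-style formulation I recall from the RepLoss paper: a smoothed log penalty on the IoU between predicted boxes assigned to different targets, averaged over the overlapping pairs. The names `rep_box_loss` and `target_ids`, the `sigma` and `eps` defaults, and the toy boxes are assumptions for illustration; AggLoss is not covered here.

```python
import math
from itertools import combinations


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def smooth_ln(x, sigma=0.5):
    """-ln(1 - x) up to sigma, then a linear continuation (0 <= sigma < 1)."""
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)


def rep_box_loss(preds, target_ids, sigma=0.5, eps=1e-6):
    """Penalise overlap between predicted boxes that belong to different targets.
    Pairs with the same target are skipped: they are expected to overlap."""
    total = 0.0
    overlapping_pairs = 0
    for i, j in combinations(range(len(preds)), 2):
        if target_ids[i] == target_ids[j]:
            continue
        overlap = iou(preds[i], preds[j])
        if overlap > 0:
            overlapping_pairs += 1
            total += smooth_ln(overlap, sigma)
    return total / (overlapping_pairs + eps)


# Hypothetical predictions for two neighbouring pedestrians (targets 0 and 1).
compact = [(0, 0, 10, 20), (8, 0, 18, 20)]   # each box hugs its own target
in_between = [(3, 0, 13, 20), (6, 0, 16, 20)]  # both boxes drift into the gap

print(rep_box_loss(compact, [0, 1]))     # small penalty
print(rep_box_loss(in_between, [0, 1]))  # clearly larger penalty
```

Boxes that drift into the gap between two pedestrians overlap heavily with boxes of the other target, which is exactly the configuration where NMS either merges the two people or keeps a spurious in-between box.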

Visualization before NMS seems to be a powerful debugging tool.

Key ideas

Technical details

Notes