IoUNet: Acquisition of Localization Confidence for Accurate Object Detection

November 2019

tl;dr: Regress a separate branch to estimate the quality of object detection in terms of IoU.

Overall impression

The vanilla version of IoU-Net (with the IoU prediction branch and Precise RoI Pooling) is already better than the baseline, most likely due to the regularization effect of the IoU branch.

The classification confidence indicates the category of a bbox but cannot be interpreted as a measure of its localization accuracy.

It generates better results than Soft-NMS (which decays the scores of overlapping candidates instead of eliminating them; see the review of Soft NMS on Zhihu), and can be dropped into many object detection frameworks.
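The IoU-guided NMS idea can be sketched as follows: candidates are ranked by the predicted localization confidence rather than the classification score, and the kept box inherits the highest classification score of the cluster it suppresses. The box coordinates, scores, and threshold below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def iou_guided_nms(boxes, cls_scores, loc_scores, thresh=0.5):
    """Rank candidates by predicted localization confidence (loc_scores),
    not by classification score; the surviving box takes the max
    classification score of the cluster it suppresses."""
    order = np.argsort(loc_scores)[::-1]
    keep_boxes, keep_scores = [], []
    while order.size > 0:
        i = order[0]
        ious = iou(boxes[i], boxes[order])
        cluster = order[ious >= thresh]
        keep_boxes.append(boxes[i])
        keep_scores.append(cls_scores[cluster].max())  # score aggregation
        order = order[ious < thresh]
    return np.array(keep_boxes), np.array(keep_scores)

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
cls_scores = np.array([0.6, 0.9, 0.8])
loc_scores = np.array([0.9, 0.5, 0.7])
kept_boxes, kept_scores = iou_guided_nms(boxes, cls_scores, loc_scores)
```

Note that the first kept box is the well-localized one (higher loc score), but it carries the higher classification score of the box it suppressed, which is exactly the decoupling of the two confidences the paper argues for.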

KL Loss and IoU-Net are similar but differ in implementation. KL Loss regresses the mean and variance directly from the same head, instead of using a separate head for IoU prediction as in IoU-Net. Also, variance voting is a single forward pass, unlike IoU-Net's iterative optimization.
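IoU-Net's iterative refinement treats the predicted IoU as an objective and adjusts the box coordinates by gradient ascent. A minimal sketch of the idea follows; in the paper the gradient comes from backpropagating through the learned IoU head, whereas here a numerical gradient of the true IoU against a fixed target box stands in for it, and the boxes, learning rate, and step count are illustrative assumptions:

```python
import numpy as np

def iou_single(a, b):
    """IoU between two boxes in (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(x2 - x1, 0.0) * max(y2 - y1, 0.0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def refine_box(box, iou_fn, steps=200, lr=2.0, eps=1e-3):
    """Gradient ascent on a scalar IoU estimate w.r.t. the 4 box coordinates.
    Uses a central-difference numerical gradient; IoU-Net instead backprops
    through its differentiable IoU prediction head."""
    box = np.asarray(box, dtype=float)
    for _ in range(steps):
        grad = np.zeros(4)
        for k in range(4):
            d = np.zeros(4)
            d[k] = eps
            grad[k] = (iou_fn(box + d) - iou_fn(box - d)) / (2 * eps)
        box += lr * grad
    return box

# Hypothetical target box standing in for the IoU head's optimum.
gt = [10.0, 10.0, 30.0, 30.0]
init = [8.0, 12.0, 26.0, 34.0]
refined = refine_box(init, lambda b: iou_single(b, gt))
```

This also makes the contrast with variance voting concrete: the loop above needs many forward (and gradient) evaluations per box, while var voting re-weights neighboring boxes in one pass.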

Key ideas

Classification confidence tends to be over-confident and bipolar (clustered near 0 or 1). This is similar to the findings in Gaussian YOLOv3.

Technical details