KL Loss: Bounding Box Regression with Uncertainty for Accurate Object Detection

November 2019

tl;dr: Predict bbox as two corners with mean and variance.

Overall impression

As in IoU Net, the motivation is that classification confidence is not always strongly correlated with localization confidence.

The paper formulates bbox regression loss as the KL divergence between a Dirac delta label and a Gaussian prediction. With a Dirac delta target, this reduces (up to constants) to the negative log-likelihood (NLL) loss. For a more generalized KL loss, see LaserNet KL.
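To make the "KL loss is essentially NLL" point concrete, here is a minimal NumPy sketch of the per-coordinate loss. It assumes the network outputs a mean and a log-variance for each box coordinate (the log-variance parameterization is an assumption made here for numerical stability, and the name `kl_loss` is hypothetical). Constant terms that do not depend on the prediction are dropped.

```python
import numpy as np

def kl_loss(x_pred, log_var, x_gt):
    """Per-coordinate NLL of x_gt under N(x_pred, exp(log_var)).

    Equals KL(Dirac(x_gt) || N(x_pred, sigma^2)) up to additive constants.
    log_var is assumed to be log(sigma^2), predicted by the network.
    """
    x_pred = np.asarray(x_pred, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    x_gt = np.asarray(x_gt, dtype=float)
    # First term: squared error scaled down by the predicted variance.
    # Second term: penalty that stops the variance from growing unboundedly.
    return 0.5 * np.exp(-log_var) * (x_gt - x_pred) ** 2 + 0.5 * log_var

# One box with four coordinates (x1, y1, x2, y2); values are made up.
pred = np.array([10.0, 20.0, 50.0, 80.0])
log_var = np.array([0.0, 0.0, 1.0, -1.0])
gt = np.array([11.0, 20.0, 52.0, 80.5])
per_coord = kl_loss(pred, log_var, gt)
total = per_coord.sum()
```

For a fixed error, the loss is minimized when the predicted variance equals the squared error, which is what pushes the variance to behave like a localization uncertainty.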

Variance voting is quite an interesting idea and can be used even without variance scores (just down-weight neighboring boxes by IoU and weight them by confidence score). I am quite surprised this has not been tried before.
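A rough sketch of how variance voting could look for a single box selected during NMS. This is my reading of the idea, not the authors' code: each overlapping neighbor votes on every coordinate, weighted up by its closeness in IoU and down by its predicted variance. The function names, the `(x1, y1, x2, y2)` box layout, and the default `sigma_t` value are assumptions for illustration. `boxes` is assumed to include the selected box itself and to contain only boxes with positive area.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box (4,) and many boxes (N, 4), format x1, y1, x2, y2."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def var_vote(box, boxes, variances, sigma_t=0.02):
    """Refine `box` using overlapping `boxes` and per-coordinate `variances`.

    boxes: (N, 4), assumed to include `box` itself.
    variances: (N, 4), predicted sigma^2 for each coordinate.
    sigma_t: hypothetical default controlling how fast weights decay with IoU.
    """
    iou = iou_one_to_many(box, boxes)
    keep = iou > 0                                   # only overlapping boxes vote
    p = np.exp(-((1.0 - iou[keep]) ** 2) / sigma_t)  # closer boxes weigh more
    w = p[:, None] / variances[keep]                 # lower variance weighs more
    return (w * boxes[keep]).sum(axis=0) / w.sum(axis=0)

selected = np.array([0.0, 0.0, 10.0, 10.0])
candidates = np.array([
    [0.0, 0.0, 10.0, 10.0],          # the selected box itself
    [1.0, 0.0, 11.0, 10.0],          # a close neighbor
    [100.0, 100.0, 110.0, 110.0],    # far away, no overlap, ignored
])
variances = np.ones_like(candidates)
refined = var_vote(selected, candidates, variances)
```

To try the variance-free variant mentioned above, one option (again an assumption) is to pass `variances` as the reciprocal of each box's confidence score broadcast over the four coordinates, so that higher-scoring boxes get larger votes.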

Learning localization confidence in addition to classification confidence can 1) give interpretable results and 2) lead to more precise localization (AP90).

KL Loss and IoU Net are similar in spirit but differ in implementation. KL Loss directly regresses the mean and variance from the same head, whereas IoU Net uses a separate head for IoU prediction. Also, Var Voting takes a single forward pass, unlike IoU Net's iterative optimization.

Key ideas

Technical details