Amodal Completion and Size Constancy in Natural Scenes

January 2020

tl;dr: Infer the horizon and veridical size of objects in 2D images, with amodal 2D object detectors.

Overall impression

The paper explores using a neural network to predict the full physical extent of an object as a 2D bbox, even when the object is occluded or truncated.

“Almost nothing is visible in its entirety, yet almost everything is perceived as a whole and complete.”

The study does not start from an amodal object detector. Instead, it uses a modal object detector and the image context inside the modal bbox to infer the full extent of the amodal bbox, by explicitly modeling occlusion patterns along with detections.

The hypothesis of amodal completion is that the amodal prediction task can be reliably addressed given just the image corresponding to the visible object region (seeing the left side of a car is sufficient to unambiguously infer its full extent without significantly leveraging context).
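
To make the modal/amodal relationship concrete, here is a minimal sketch of one way to express an amodal bbox relative to a modal bbox. The parametrization (amodal corners in the modal box's normalized frame) and the names `Box`, `encode_amodal`, and `decode_amodal` are assumptions for illustration, not necessarily the regression target used in the paper.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in pixels: (x1, y1) top-left, (x2, y2) bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def encode_amodal(modal: Box, amodal: Box) -> Box:
    """Express the amodal box in the modal box's normalized frame.

    In this frame the modal box itself is (0, 0, 1, 1), so values outside
    [0, 1] mean the object extends beyond its visible region.
    (Hypothetical parametrization, chosen for illustration.)
    """
    w = modal.x2 - modal.x1
    h = modal.y2 - modal.y1
    return Box(
        (amodal.x1 - modal.x1) / w,
        (amodal.y1 - modal.y1) / h,
        (amodal.x2 - modal.x1) / w,
        (amodal.y2 - modal.y1) / h,
    )


def decode_amodal(modal: Box, target: Box) -> Box:
    """Invert encode_amodal: map a normalized target back to pixels."""
    w = modal.x2 - modal.x1
    h = modal.y2 - modal.y1
    return Box(
        modal.x1 + target.x1 * w,
        modal.y1 + target.y1 * h,
        modal.x1 + target.x2 * w,
        modal.y1 + target.y2 * h,
    )


# Only the left half of a car is visible; the full car is twice as wide.
modal_box = Box(100.0, 50.0, 200.0, 150.0)
amodal_box = Box(100.0, 50.0, 300.0, 150.0)
target = encode_amodal(modal_box, amodal_box)
recovered = decode_amodal(modal_box, target)
```

Under this assumed encoding, a network that sees only the modal crop would regress `target`, and `decode_amodal` would map its output back to image coordinates.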

Three important cues for depth perception: familiar size, relative size, and perspective position.

There is a strong assumption that all objects detected are on the ground, which may not be true, even in the sample images shown.
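
The ground assumption matters because it is what ties the horizon to real-world size. A minimal sketch of the standard single-view relation follows, assuming a level pinhole camera, image rows that increase downward, and an object standing on a flat ground plane. This is textbook geometry, not necessarily the paper's exact formulation, and the function and parameter names are hypothetical. The box is assumed to be amodal, since the bottom edge of an occluded object's visible region generally does not touch the ground.

```python
def object_height(v_top: float, v_bottom: float,
                  v_horizon: float, camera_height: float) -> float:
    """Estimate real-world object height from an amodal bbox and the horizon.

    With a level camera at height `camera_height` above the ground:
        v_bottom - v_horizon = f * camera_height / Z
        v_bottom - v_top     = f * H / Z
    Dividing the two cancels the focal length f and the depth Z:
        H = camera_height * (v_bottom - v_top) / (v_bottom - v_horizon)
    """
    ground_offset = v_bottom - v_horizon
    if ground_offset <= 0:
        # A ground contact point at or above the horizon breaks the
        # ground-plane assumption (e.g. the object is not on the ground).
        raise ValueError("object bottom must lie below the horizon")
    return camera_height * (v_bottom - v_top) / ground_offset


def object_depth(v_bottom: float, v_horizon: float,
                 camera_height: float, focal_px: float) -> float:
    """Distance along the optical axis to the object's ground contact point."""
    ground_offset = v_bottom - v_horizon
    if ground_offset <= 0:
        raise ValueError("object bottom must lie below the horizon")
    return focal_px * camera_height / ground_offset


# Camera 1.6 m above the ground, horizon at row 200,
# amodal box spanning rows 100 (top) to 400 (bottom).
height_m = object_height(v_top=100, v_bottom=400,
                         v_horizon=200, camera_height=1.6)
depth_m = object_depth(v_bottom=400, v_horizon=200,
                       camera_height=1.6, focal_px=800)
```

An object that floats above the ground, or rests on another object, violates the first equation in the docstring, so its estimated height and depth would be biased. This is one way to see why the assumption is a strong one.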

Key ideas

Technical details