Perceiving Humans: from Monocular 3D Localization to Social Distancing

September 2021

tl;dr: Improved version of MonoLoco (MonoLoco++) and its application to social distancing.

Overall impression

This paper builds upon the previous work of MonoLoco.

The low-dimensional representation of humans gives it better generalization. It escapes the image domain and reduces the input dimensionality. This makes the skeleton-based network extremely fast to train (2 min on a single 1080Ti GPU).
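To get a feel for how much the input shrinks, here is a rough sketch comparing a flattened 2D skeleton with a raw image. The numbers are assumptions for illustration and are not taken from this note: a COCO-style skeleton with 17 keypoints, (x, y) coordinates per keypoint, and a KITTI-like image size of 1242x375x3. The helper `flatten_keypoints` is a hypothetical name, not part of the MonoLoco code base.

```python
# Hypothetical illustration of the input-size gap between a skeleton
# representation and a raw image. All constants below are assumptions.

NUM_KEYPOINTS = 17        # assumed: COCO-style skeleton
IMAGE_WIDTH = 1242        # assumed: KITTI-like image width in pixels
IMAGE_HEIGHT = 375        # assumed: KITTI-like image height in pixels
IMAGE_CHANNELS = 3        # RGB


def flatten_keypoints(keypoints):
    """Flatten [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...]."""
    return [value for point in keypoints for value in point]


# Dummy skeleton: one (x, y) pixel coordinate per keypoint.
skeleton = [(100.0 + i, 200.0 + 2.0 * i) for i in range(NUM_KEYPOINTS)]
features = flatten_keypoints(skeleton)

skeleton_dim = len(features)
image_dim = IMAGE_WIDTH * IMAGE_HEIGHT * IMAGE_CHANNELS

print("skeleton input size:", skeleton_dim)
print("image input size:", image_dim)
print("approximate reduction factor:", image_dim // skeleton_dim)
```

Under these assumptions the network sees a few dozen numbers per person instead of over a million pixel values, which is consistent with the very short training time quoted above.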

MonoLoco++ beats mono3D baselines (such as SMOKE). The two perform roughly the same on easy cases, but MonoLoco++ does much better on medium and hard cases. MonoLoco++ also has higher recall than SMOKE (70% vs. 39%).

Key ideas

Technical details