The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes

November 2020

tl;dr: VoxelNet + UKF for 3D detection and tracking in crowded urban scenes.

Overall impression

The H3D dataset includes 160 scenes and 30k frames at 2 Hz, roughly 90 seconds per scene.
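
As a quick sanity check, the per-scene duration can be recomputed from the rounded figures in this note. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the inputs are the approximate numbers quoted above, not exact dataset statistics.

```python
# Approximate figures as quoted in these notes (not exact dataset statistics).
num_scenes = 160
num_frames = 30_000          # "30k frames", rounded
annotation_rate_hz = 2.0     # frames per second

# Total covered time, then the average per scene.
total_seconds = num_frames / annotation_rate_hz
seconds_per_scene = total_seconds / num_scenes
frames_per_scene = num_frames / num_scenes

print(f"{seconds_per_scene:.1f} s and {frames_per_scene:.1f} frames per scene on average")
```

The result lands in the low 90s of seconds per scene, consistent with the "roughly 90 seconds" figure.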

The scenes are really crowded: H3D has roughly the same number of people as vehicles.

Key ideas

Technical details