SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

May 2023

tl;dr: Dense annotation generation for 3D Occupancy Prediction.

Overall impression

An occupancy grid can describe real-world objects of arbitrary shapes and infinite classes. SurroundOcc proposes a pipeline to generate dense occupancy GT without expensive occupancy annotation by human labelers. The paper also demonstrates very clearly that with denser labels, the performance of a previous method (TPVFormer) can be significantly boosted, by almost 3x. This is the largest contribution of this paper.

The paper is from the same group as SurroundDepth, which performs bottom-up depth estimation from the standpoint of the source (the camera views). In comparison, SurroundOcc performs occupancy prediction from the standpoint of the target (the 3D volume). This relationship is quite similar to that between Lift-Splat-Shoot and BEVFormer.
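The source-versus-target distinction can be illustrated with a minimal sketch. This is not code from either paper: the function names `lift_pixels` and `project_voxels`, the pinhole intrinsics `K`, and the single-camera setup are all assumptions made for illustration. The first function goes from pixels to 3D (the Lift-Splat-Shoot direction), the second goes from 3D queries back to pixels (the BEVFormer direction).

```python
import numpy as np

def lift_pixels(depth, K):
    """Source-side (forward) view transform: pixel + depth -> 3D point.

    depth: (h, w) per-pixel depth; K: (3, 3) pinhole intrinsics (assumed).
    Returns (h*w, 3) points in the camera frame.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T   # back-project to rays at unit depth
    return rays * depth.reshape(-1, 1)  # scale each ray by its depth

def project_voxels(centers, K):
    """Target-side (backward) view transform: 3D query -> pixel to sample.

    centers: (n, 3) voxel centers in the camera frame.
    Returns (n, 2) pixel coordinates and a mask of points in front of the camera.
    """
    uvw = centers @ K.T
    in_front = uvw[:, 2] > 0
    z = np.where(in_front, uvw[:, 2], 1.0)  # placeholder depth avoids divide-by-zero
    return uvw[:, :2] / z[:, None], in_front
```

In the forward direction the quality of the 3D result depends on the predicted depth, while in the backward direction every 3D location is queried regardless of depth, which is one reason the target-side formulation suits dense volumetric outputs.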

There are two pipelines for generating occupancy labels: Poisson Reconstruction, used by SurroundOcc, and the Augment-and-Purify (AAP) pipeline proposed by OpenOccupancy. They share the initial steps but differ in the refinement step. It would be interesting to see a side-by-side comparison of the two.
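Whichever refinement is used, the end product is a voxel grid of occupancy labels. The sketch below shows only that final voxelization step under simplifying assumptions: the helper name `voxelize`, the `pc_range` layout `[x_min, y_min, z_min, x_max, y_max, z_max]`, and the binary (non-semantic) labels are all hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def voxelize(points, pc_range, voxel_size):
    """Mark every voxel that contains at least one point as occupied.

    points: (n, 3) array of xyz coordinates.
    pc_range: [x_min, y_min, z_min, x_max, y_max, z_max] (assumed layout).
    voxel_size: edge length of a cubic voxel, in the same units as points.
    """
    lo = np.asarray(pc_range[:3], dtype=float)
    hi = np.asarray(pc_range[3:], dtype=float)
    dims = np.round((hi - lo) / voxel_size).astype(int)    # grid resolution per axis
    idx = np.floor((points - lo) / voxel_size).astype(int)  # voxel index of each point
    keep = np.all((idx >= 0) & (idx < dims), axis=1)        # drop out-of-range points
    grid = np.zeros(tuple(dims), dtype=bool)
    grid[tuple(idx[keep].T)] = True
    return grid
```

With sparse single-sweep LiDAR points most voxels on a surface stay empty, which is why a densification step before voxelization matters for label quality.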

Key ideas

Technical details