BirdGAN: Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

October 2019

tl;dr: Learn to map a 2D perspective image to BEV with a GAN.

Overall impression

BirdGAN's performance on 3D object detection is SOTA. The AP_3D @ IoU=0.7 is ~60 for easy and ~40 for hard, which is much better than the ~10 for ForeSeE.
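To get a feel for how strict the IoU=0.7 threshold behind these AP_3D numbers is, here is a minimal sketch of a 3D IoU check. It assumes axis-aligned boxes with no yaw angle, which is a simplification (KITTI boxes are rotated), and `AxisAlignedBox`, `iou_3d` and `is_true_positive` are hypothetical names for illustration, not code from the paper or the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisAlignedBox:
    # Hypothetical container: min/max corners in meters, no yaw angle.
    x_min: float
    y_min: float
    z_min: float
    x_max: float
    y_max: float
    z_max: float

    def volume(self) -> float:
        return (
            max(0.0, self.x_max - self.x_min)
            * max(0.0, self.y_max - self.y_min)
            * max(0.0, self.z_max - self.z_min)
        )


def iou_3d(a: AxisAlignedBox, b: AxisAlignedBox) -> float:
    # Overlap length along each axis; non-positive means no overlap.
    dx = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    dy = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    dz = min(a.z_max, b.z_max) - max(a.z_min, b.z_min)
    if dx <= 0 or dy <= 0 or dz <= 0:
        return 0.0
    intersection = dx * dy * dz
    union = a.volume() + b.volume() - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(pred: AxisAlignedBox, gt: AxisAlignedBox,
                     threshold: float = 0.7) -> bool:
    # A prediction only counts as a match if IoU reaches the threshold.
    return iou_3d(pred, gt) >= threshold


# A car-sized ground-truth box (4 m x 2 m x 1.5 m) and two predictions
# that are shifted along x by 0.5 m and 1.0 m respectively.
gt = AxisAlignedBox(0.0, 0.0, 0.0, 4.0, 2.0, 1.5)
pred_close = AxisAlignedBox(0.5, 0.0, 0.0, 4.5, 2.0, 1.5)
pred_far = AxisAlignedBox(1.0, 0.0, 0.0, 5.0, 2.0, 1.5)
```

In this toy setup a 0.5 m shift still passes the threshold, while a 1 m shift on a 4 m long box already fails it, which is why depth errors hurt AP_3D so much.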

One major drawback is the limited forward distance BirdGAN can handle. In the clipping case, the frontal depth is only about 10 to 15 meters.
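As a rough illustration of what a clipped BEV covers, the sketch below drops every point beyond a forward limit before rasterizing into an occupancy grid. This is not the BirdGAN pipeline: the 15 m default comes from the range quoted above, while the coordinate convention (x forward, y left, z up), the lateral half-width, the cell size and the function name `clip_and_rasterize` are all assumptions made for the example.

```python
def clip_and_rasterize(points, max_forward=15.0, half_width=10.0, cell=0.5):
    """Clip points to a forward range and mark occupied BEV cells.

    points: iterable of (x_forward, y_left, z_up) tuples in meters
            (assumed convention, not taken from the paper).
    Returns (grid, kept) where grid[row][col] is 1 for occupied cells,
    row indexes forward distance and col indexes lateral position.
    """
    rows = int(max_forward / cell)
    cols = int(2 * half_width / cell)
    grid = [[0] * cols for _ in range(rows)]
    kept = 0
    for x, y, _z in points:
        # Anything behind the sensor, beyond the forward limit, or outside
        # the lateral window is simply dropped.
        if not (0.0 <= x < max_forward and -half_width <= y < half_width):
            continue
        # min() guards against floating-point rounding at the far edge.
        row = min(int(x / cell), rows - 1)
        col = min(int((y + half_width) / cell), cols - 1)
        grid[row][col] = 1
        kept += 1
    return grid, kept


# Toy point set: two nearby points, two distant ones, one behind the sensor.
points = [
    (5.0, 0.0, 0.0),
    (14.9, -3.0, 0.5),
    (30.0, 1.0, 0.0),
    (60.0, -2.0, 0.0),
    (-1.0, 0.0, 0.0),
]
grid, kept = clip_and_rasterize(points)
```

With the 15 m limit, the points at 30 m and 60 m never reach the grid, so any object at those distances is invisible to a downstream BEV detector.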

Personally, I feel GAN-related architectures are not reliable enough for production. The closest-to-production research so far is still pseudo-lidar++.

Key ideas

Technical details