Learning-Deep-Learning

ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings

June 2020

tl;dr: A general dynamic SLAM system that can detect and track moving objects in 3D (3D MOD).

Overall impression

The paper is similar in function to Cube SLAM: it tracks objects and uses them to increase the robustness of SLAM, and it can also handle dynamic scenes. It is more flexible in that detection is based on the point cloud of landmarks in each cluster. In comparison, Cube SLAM models each object as a 3D cuboid, and QuadricSLAM models each object as an ellipsoid.

The paper is based on a stereo system, but its 3D object detection (3DOD) performance is even worse than mono3D. This is due to the way ClusterVO generates a 3D bounding box from the tracked cluster of landmarks, which may not cover the full extent of the object.
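To illustrate why a box fitted to a tracked cluster can undershoot the object, here is a minimal toy sketch. It is not ClusterVO's actual box-fitting procedure: the car dimensions, the choice of two visible faces, and the helper names `sample_visible_landmarks` and `cluster_extent` are all assumptions made for illustration. It samples a few landmarks on the visible surfaces of a box and compares the cluster's axis-aligned extent with the true size.

```python
import random


def sample_visible_landmarks(size, n, seed=0):
    """Sample n landmarks on two visible faces of a box centered at the origin.

    Hypothetical setup: the camera sees only the rear face (x = -length/2)
    and one side face (y = -width/2) of the object.
    """
    length, width, height = size
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        z = rng.uniform(-height / 2, height / 2)
        if rng.random() < 0.5:
            # Rear face: x is fixed, y varies across the width.
            y = rng.uniform(-width / 2, width / 2)
            points.append((-length / 2, y, z))
        else:
            # Side face: y is fixed, x varies along the length.
            x = rng.uniform(-length / 2, length / 2)
            points.append((x, -width / 2, z))
    return points


def cluster_extent(points):
    """Axis-aligned extent (dx, dy, dz) spanned by the landmark cluster."""
    return tuple(
        max(p[i] for p in points) - min(p[i] for p in points)
        for i in range(3)
    )


true_size = (4.5, 1.8, 1.5)  # assumed car size in meters (length, width, height)
landmarks = sample_visible_landmarks(true_size, n=15)
extent = cluster_extent(landmarks)

print("true size     :", true_size)
print("cluster extent:", tuple(round(e, 2) for e in extent))
```

Because the landmarks lie on or inside the object's surface, the cluster extent can never exceed the true size, and with sparse landmarks it is typically smaller along every axis. A detector that regresses the full box directly does not have this bias.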

Terminology:

Key ideas

Technical details

Notes