Dynamic Graph CNN for Learning on Point Clouds (EdgeConv, DGCNN)

Mar 2019

tl;dr: Extract semantic features from a point cloud by iteratively performing convolution on a dynamically updated neighborhood graph.

Overall impression

This paper builds on the PointNet architecture and addresses the same problem that PointNet++ tried to solve: PointNet processes each input point independently and uses no local neighborhood information. Instead of grouping points around centroids chosen by farthest point sampling, EdgeConv builds a kNN graph over the points and recomputes it in feature space at every layer.

Key ideas

\[x'_i = \sum_{j:(i, j)\in E} h_{\theta}(x_i, x_j)\]

Note that the sum is a placeholder for any symmetric aggregation and can be replaced by a max operation.

- If $h_{\theta}(x_i, x_j) = \theta_j x_j$, then this is conventional convolution in Euclidean space.
- If $h_{\theta}(x_i, x_j) = h_{\theta}(x_i)$, then this is PointNet: each point is processed on its own and the neighbors are ignored.
- In this paper, $h_{\theta}(x_i, x_j) = \bar{h}_{\theta}(x_i, x_j - x_i)$, which captures the global information ($x_i$) and the local information ($x_j - x_i$).
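
The edge function above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the names `knn_indices`, `edge_conv`, `theta`, and `phi` are hypothetical, the edge function is assumed to be a single linear layer with ReLU, $\mathrm{ReLU}(\theta \cdot (x_j - x_i) + \phi \cdot x_i)$, aggregated with max, and each point is excluded from its own neighbor list for simplicity.

```python
import numpy as np


def knn_indices(x, k):
    """Indices (n, k) of the k nearest neighbors of each row of x (self excluded)."""
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, shape (n, n).
    d2 = sq[:, None] - 2.0 * (x @ x.T) + sq[None, :]
    np.fill_diagonal(d2, np.inf)  # assumption: no self-loops
    return np.argsort(d2, axis=1)[:, :k]


def edge_conv(x, k, theta, phi):
    """One EdgeConv-style layer: x is (n, f_in), theta and phi are (f_in, f_out)."""
    idx = knn_indices(x, k)   # graph is rebuilt from the current features
    neighbors = x[idx]        # (n, k, f_in), the x_j
    center = x[:, None, :]    # (n, 1, f_in), the x_i
    # Local term uses x_j - x_i, global term uses x_i; ReLU as the nonlinearity.
    edge = np.maximum((neighbors - center) @ theta + center @ phi, 0.0)
    return edge.max(axis=1)   # max over the k edges replaces the sum


rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))  # toy point cloud
t1, p1 = rng.normal(size=(3, 16)), rng.normal(size=(3, 16))
t2, p2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))

feat1 = edge_conv(points, k=8, theta=t1, phi=p1)  # kNN in xyz space
feat2 = edge_conv(feat1, k=8, theta=t2, phi=p2)   # kNN in feature space
```

Because `edge_conv` calls `knn_indices` on whatever features it receives, stacking two layers as in the last lines gives the "dynamic" graph: the second layer's neighbors are chosen by distance in the 16-dimensional feature space rather than in the original coordinates.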

Technical details


The first two have similar intrinsic features (local, more similar when zoomed in), and the latter two have similar extrinsic features (global, more similar when zoomed out).

If you were somehow shrunk down so that you could walk around “inside” the data set and could only see nearby data points, you might think of the two blobs in each data set as rooms, and the narrow neck as a hallway. If you walked from one room to the other, you might not notice whether or not it was curving. So as in the building example, you would have a hard time telling the difference between the two sets from “inside” of them. (source, source2)