September 9, 2018

NCH: finding a metric for point removal

Introduction

The objective of my thesis is to find ways to remove points from the input point cloud to NCH, while managing to guarantee that the error incurred is not too big. Clearly, the last sentence is far from specific: I haven’t defined a metric, nor said what “too big” means.

Before we begin, let’s recap on what the definition of the NCH function looks like:

$$ f(x) = \max_{1 \leq i \leq N} f_i(x) $$

This defines the implicitly reconstructed surface $f(x) = 0$, where the $f_i(x)$ are basis functions, defined one per point in the cloud as:

$$ f_i(x) = \langle n_i, x - p_i \rangle - \rho_i ||x - p_i||^2 $$

$f(x)$ has a really good property: it is a Signed Distance Function; this means, according to Wikipedia:

The signed distance function (or oriented distance function) of a set $\Omega$ in a metric space determines the distance of a given point $x$ from the boundary of $\Omega$, with the sign determined by whether $x$ is in $\Omega$. The function has positive values at points $x$ inside $\Omega$, it decreases in value as $x$ approaches the boundary of $\Omega$ where the signed distance function is zero, and it takes negative values outside of $\Omega$

Thus, informally: $f(x) > 0$ when $x$ is inside the reconstructed surface, $f(x) = 0$ at its border, and $f(x) < 0$ outside of it.

What a metric should look like

An intuitive setting to use when analyzing a simplification algorithm would go something like this: given a finite set of oriented points $\mathcal{P} = \{p_1, …, p_N\}$ with normals $\{n_1, …, n_N \}$ and the associated NCH function $f(x)$ (and basis functions $f_i(x)$), we want to find $\mathcal{P}’ \subset \mathcal{P}$ such that its NCH function $g(x)$ (and basis functions $g_j(x)$) defines a surface that is similar to the one defined by $f$.

The problem here lies in how to measure the new surface’s “similarity to $f$”. There is a lot of reading to be done on this topic alone, of course. At first sight it seems that, ideally, one would compute the Hausdorff Distance between the two volumes, and be done with the problem.

In practice, this is hard to compute: you have to take a maximum over supremums and infimums over all points in the surface. The practical way to approach this problem involves extracting the surface from the level curve into a mesh, and then measuring distance between meshes. This is the approach that Metro takes, for example.

The methodology is problematic: you have to first transform the level curve into a mesh, which means running Marching Cubes or a similar algorithm; these have their own set of parameters and of course add error into the process of computing the metric.

A yet to be published paper on NCH simplification by my thesis advisor takes a different route, and looks at the value of $g$ at the points that were removed while simplifying the point cloud. This makes sense because $g$ is a signed distance function: $|g(p_h)|$, where $p_h$ is a removed point, gives a metric as to “how far away a point you knew was on the border of the surface is when you remove that point and reconstruct the surface again”. This, however, is basically a proxy to what we actually want to measure, which is the point’s distance to the border of the reconstructed surface.

Measuring distance to the isosurface

In this context, I started trying to find the distance from $p_h$ to the isosurface. First thing I did was to formulate the problem:

\begin{align} &\operatorname{minimize}& & || p - p_h || \\
&\operatorname{subject\;to} & &f(p) = 0 \end{align}

Being a computer scientist with a relatively low knowledge of math, I did what any other person would do: Google. This led me to Boyd and Vandenberghe’s great book on Convex Optimization, which

Attempt 1: Convex Optimization

Notice that $|| p - p_h ||$ is a convex function. However, $f(p)$ is a maximum over concave functions, which doesn’t tell you anything of worth for solving this. The key here is to look at the value of $f(p_h)$:

  1. If $f(p_h) = 0$, then even though we removed $p_h$, some other point ended up defining the same radius, and thus we don’t incur any error out of the simplification (i.e., we don’t need to account for this case).
  2. If $f(p_h) < 0$, then we can just look for $f(p) \geq 0$, because $p_h$ is outside of the surface, and thus looking for a point that is inside of it will always give a point at the border.
  3. If $f(p_h) > 0$, then the point is inside the surface, and thus we can look for $f(p) \leq 0$, under the same reasoning as above.

Handling $f(p_h) < 0$

In this case, the problem turns into

\begin{align} &\operatorname{minimize}& & || p - p_h || \\
&\operatorname{subject\;to} & & -f_i(p) \leq 0, \quad i = 1,\dots,N \end{align}

This is great, because now we have a convex cost function with convex constraints, and this is readily solvable using any convex optimization package.

I took a black-box approach to this and used cvxpy to model this with success, in roughly 20 lines of Python.

Handling $f(p_h) > 0$

This case is slightly more problematic than the previous one:

\begin{align} &\operatorname{minimize}& & || p - p_h || \\
&\operatorname{subject\;to} & & f_i(p) \leq 0, \quad i = 1,\dots,N \end{align}

However, $f_i(p)$ are all now concave constraints. I got stuck with this here, my understanding of convex optimization is lacking to say the least, so I couldn’t find any way to turn this into something I can use.

Attempt 2: Optimization with Derivatives

I found this post by Inigo Quilez, where he basically applies one round of Newton’s method as a way to find a lower bound for the distance of a point to the surface. There are a few problems here:

  • First and foremost, it assumes that the function is differentiable. This is not necessarily true in our case, as $f(x)$ is a maximum.
  • It also assumes that the first order Taylor approximation to $f(x)$ is accurate. For this to be true, the intersection point has got to be sufficiently close to the surface. This may not be true in our case: there is a priori no upper bound to how the surface may change due to a point removal.

If you plot $f(x)$ for a few examples and stare at it long enough, it starts to look like it is really well behaved. Thus, I set out to try to prove whatever I could in terms of continuity and differentiability.

$f(x)$ is differentiable almost everywhere

The first thing to notice is that all $f_i(x)$ are $\mathcal{C}^\infty$; there is nothing to prove here, as this is an elementary fact. Since these are all smooth functions, they are also locally Lipschitz. Now we can use Lemma 2.1 from these lectures to establish that $f(x)$ is locally Lipschitz (as it is a supremum over a finite set of functions). Finally, Rademacher’s theorem, proven in the same notes, tells us that $f(x)$ is almost-everywhere differentiable. Now we need to figure out what the derivative of $f(x)$ actually is.

Towards this purpose, we can define $J(x) = \{ i : f_i(x) = f(x) \}$, that is, the set of points that achieve the maximum for a given point. In principle, $J(x) \neq \phi$ for sure, as every point has at least one function that determines the maximum. It may be a singleton, or it may be of size up to $N$ (the number of points).

In the escenario in which $J(x) = \{ i \}$, if you write down the directional derivative at the point, then it will become the expression for the directional derivative of $f_i(x)$. This tells me that $\nabla f(x) = \nabla f_i(x)$. This is the case for most points, except for the boundaries where the value of $f(x)$ “changes hands”.

If we look at the case when there are many such points, then things become much harder. The only thing I was able to see clearly is that, if all $f_i(x)$ that determine the value of $f(x)$ have the same gradient, then that will be $f(x)$’s gradient. But this is far from useful in this context.

This was as far as I was able to get, but this also gave me a few ideas as to what to do next.

Lagrange Multipliers

Perhaps the most well known constrained optimization method in existence is that of Lagrange Multipliers. They also seemed to solve exactly the same problem I was looking to solve. So I set on my way to write the Lagrangian down for my problem:

$$\mathcal{L}(p, \lambda) = -||p - p_h||^2 - \lambda f(p) $$

This becomes weird really fast: since $p$ can move around, the value of $f(p)$ could be determined by any basis function, so we can’t really optimize it this way. There’s an unexplored alternative here, which is to do Lagrange Multipliers with multiple $\leq$ constraints; it just seemed too complicated.

Gradient Descent

Inspired by the deep learning hype that has been going around for the last few years, it seemed reasonable to frame this as optimizing a non-convex cost function such as this:

$$\mathcal{L}(p) = ||p - p_h||^2 + \lambda f(p)^2 $$

And this in fact works out to some degree, but the solution is still lacking. Depending on the $\lambda$ regularization you use (and learning rate), its possible to get a reasonably good approximation in as little as 4 gradient updates. This is all using as gradient for $f(p)$ the gradient of the basis that achieves the maximum, as established above.

Newton’s Method

TODO: New experiments with Newton.