### Convergence criterion in built-in Newton solver

98

views

1

Hi all,

I have looked at the source code of the built-in Newton solver on Bitbucket:

https://bitbucket.org/fenics-project/dolfin/src/6e04afacda3deefdf9fc41d42fd7738288538df1/dolfin/nls/NewtonSolver.cpp?at=master&fileviewer=file-view-default

I am confused by an aspect of the convergence criterion and I would like to get it right. If the user-set criterion is "residual", then the code computes the $\ell_2$ℓ

What I find confusing is that I would expect either the $\ell_{\infty}$ℓ

Why is this the best criterion?

Thanks in advance!

I have looked at the source code of the built-in Newton solver on Bitbucket:

https://bitbucket.org/fenics-project/dolfin/src/6e04afacda3deefdf9fc41d42fd7738288538df1/dolfin/nls/NewtonSolver.cpp?at=master&fileviewer=file-view-default

I am confused by an aspect of the convergence criterion and I would like to get it right. If the user-set criterion is "residual", then the code computes the $\ell_2$ℓ

_{2}norm of the weak residual _*vector_*and compares it against the tolerance.What I find confusing is that I would expect either the $\ell_{\infty}$ℓ

_{∞}norm (maximum absolute residual at vertices) or the $L_2$`L`_{2}norm (as in: integration over the whole domain) to be used. I can see that the choice of using the $\ell_2$ℓ_{2}norm of the vector might reflect that we only know the function value at the vertices - not elsewhere - but somehow every notion of where the points sit and their distances is lost this way.Why is this the best criterion?

Thanks in advance!

Community: FEniCS Project

### 1 Answer

5

Choices of norms and tolerances for algebraic solvers is something that I've spent some time wondering about myself. Here is my take on the matter:

I believe it is partly an arbitrary choice, and \(\ell^2\) is not really the "best criterion" in any precise sense. One thing to keep in mind is that, for a fixed mesh, the algebraic residual is in a fixed finite-dimensional space, so we have the result that any two norms for measuring it, say, \(\Vert\cdot\Vert\) and \(\vert\vert\vert\cdot\vert\vert\vert\), are "equivalent", in the sense that

\[\exists c_1, c_2~~\text{s.t.}~\forall u~~~c_1\Vert u\Vert \leq \vert\vert\vert u\vert\vert\vert \leq c_2\Vert u\Vert\text{ ,}\]

so convergence of the algebraic residual to zero in one norm implies convergence in any norm:

\[\vert\vert\vert u\vert\vert\vert \leq \epsilon\vert\vert\vert u_0\vert\vert\vert~~~\Rightarrow~~~\Vert u\Vert \leq \frac{c_2}{c_1}\epsilon\Vert u_0\Vert\text{ .}\]

However, this is not the full story, since \(c_1\) and \(c_2\) depend on the mesh, as does the extent to which the residual magnitude reflects functional error in the approximate solution, so the interpretation of \(\epsilon\) in the PDE setting will be mesh-dependent.

A more principled approach is to view the residual of the discretized problem as a functional, and measure it in the dual norm of the solution space, as discussed here

https://link.springer.com/article/10.1007/s100920170006

That reference discusses iterative solvers for linear algebraic problems coming from discretizing linear PDEs, although it addresses a similar question, namely, "What does algebraic convergence say about functional convergence?" In practice, though, I don't think these more elaborate criteria are widely used, because they make the convergence check more expensive, and are PDE-dependent, so developers would not be able to implement a generic, black-box linear or nonlinear solver once, for application to whatever algebraic problem a user might come up with.

I believe it is partly an arbitrary choice, and \(\ell^2\) is not really the "best criterion" in any precise sense. One thing to keep in mind is that, for a fixed mesh, the algebraic residual is in a fixed finite-dimensional space, so we have the result that any two norms for measuring it, say, \(\Vert\cdot\Vert\) and \(\vert\vert\vert\cdot\vert\vert\vert\), are "equivalent", in the sense that

\[\exists c_1, c_2~~\text{s.t.}~\forall u~~~c_1\Vert u\Vert \leq \vert\vert\vert u\vert\vert\vert \leq c_2\Vert u\Vert\text{ ,}\]

so convergence of the algebraic residual to zero in one norm implies convergence in any norm:

\[\vert\vert\vert u\vert\vert\vert \leq \epsilon\vert\vert\vert u_0\vert\vert\vert~~~\Rightarrow~~~\Vert u\Vert \leq \frac{c_2}{c_1}\epsilon\Vert u_0\Vert\text{ .}\]

However, this is not the full story, since \(c_1\) and \(c_2\) depend on the mesh, as does the extent to which the residual magnitude reflects functional error in the approximate solution, so the interpretation of \(\epsilon\) in the PDE setting will be mesh-dependent.

A more principled approach is to view the residual of the discretized problem as a functional, and measure it in the dual norm of the solution space, as discussed here

https://link.springer.com/article/10.1007/s100920170006

That reference discusses iterative solvers for linear algebraic problems coming from discretizing linear PDEs, although it addresses a similar question, namely, "What does algebraic convergence say about functional convergence?" In practice, though, I don't think these more elaborate criteria are widely used, because they make the convergence check more expensive, and are PDE-dependent, so developers would not be able to implement a generic, black-box linear or nonlinear solver once, for application to whatever algebraic problem a user might come up with.

Thanks a lot for this detailed and clear answer!

written
16 days ago by
scaramouche-00

Please login to add an answer/comment or follow this question.