catching solver failures as a python exception

13 months ago by
I've found that if a nonlinear dolfin solver fails, for instance by exceeding the maximum number of iterations when the convergence criteria are too tight, the program stops completely. It would be better if the failure could be caught as an exception. That would allow the convergence criteria to be adjusted or, in a multi-job application, allow the failed calculation to be skipped while the program proceeds with the next job.

I couldn't find any Python exception that deals with solver failure (I'm working with FEniCS 2016.2.0). Putting the solve command in a try/except block caught nothing and didn't keep the program from crashing. Is there really no mechanism for this, or have I missed a detail?

1 Answer

13 months ago by
If the solver fails, it is often because the system of equations is not well posed. Typically, you need to evaluate the system at the point of failure to diagnose the problem and derive non-trivial solutions. However, you can set the solver parameter ``error_on_nonconvergence`` to ``False``. The solve function then returns instead of stopping the program, and its return value includes a boolean indicating convergence (the second entry of the returned pair). With this, you can automate corrections in your code.

For example, here is a solver for a time-dependent nonlinear problem that automatically decreases the relaxation parameter on failure:

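from dolfin import NonlinearVariationalProblem, NonlinearVariationalSolver

# delta (residual form), U (unknown), U1 (previous-step solution),
# J (Jacobian), bcs and times are assumed to be defined earlier.

params      = {'newton_solver' :
                {
                  'linear_solver'           : 'mumps',
                  'absolute_tolerance'      : 1e-14,
                  'relative_tolerance'      : 1e-9,
                  'relaxation_parameter'    : 1.0,
                  'maximum_iterations'      : 20,
                  'error_on_nonconvergence' : False
                }}
ffc_options = {"optimize"               : True}

# assumed: ffc_options is passed as the form compiler parameters :
problem = NonlinearVariationalProblem(delta, U, J=J, bcs=bcs,
            form_compiler_parameters=ffc_options)
solver  = NonlinearVariationalSolver(problem)
solver.parameters.update(params)

adaptive = True

# loop over all times :
for t in times:

  # set the previous solution to the last iteration (assumed update) :
  U1.assign(U)

  # Compute solution
  if not adaptive:
    solver.solve()

  # solve equation, lower alpha on failure :
  elif adaptive:
    solved_u = False
    par    = params['newton_solver']
    while not solved_u:
      # give up once the relaxation parameter gets too small :
      if par['relaxation_parameter'] < 0.5:
        status_u = [False, False]
        break
      # keep deep copies so a failed attempt can be rolled back :
      U_temp   = U.copy(True)
      U1_temp  = U1.copy(True)
      status_u = solver.solve()      # (iterations, converged)
      solved_u = status_u[1]
      if not solved_u:
        # restore the state, damp the Newton step and retry :
        U.assign(U_temp)
        U1.assign(U1_temp)
        par['relaxation_parameter'] /= 1.4
        solver.parameters.update(params)
        s = ">>> WARNING: newton relaxation parameter lowered to %g <<<"
        print(s % par['relaxation_parameter'])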
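Stripped of the dolfin specifics, the retry loop above comes down to "try, and on failure damp the step and try again until a floor is reached". A minimal sketch of that pattern in plain Python follows. The names solve_with_backoff and fake_attempt are hypothetical and not part of dolfin; fake_attempt only stands in for a call such as solver.solve() that returns an (iterations, converged) pair.

```python
def solve_with_backoff(attempt, relaxation=1.0, factor=1.4, floor=0.5):
    """Call attempt(relaxation) until it reports convergence.

    attempt is a hypothetical callable returning (iterations, converged).
    Returns (converged, relaxation_used).
    """
    while relaxation >= floor:
        iterations, converged = attempt(relaxation)
        if converged:
            return True, relaxation
        relaxation /= factor      # damp the Newton step and try again
    return False, relaxation      # gave up: relaxation fell below the floor


# Stand-in for a real solve: "converges" only when relaxation is below 0.6.
def fake_attempt(relaxation):
    return (20, relaxation < 0.6)


ok, used = solve_with_backoff(fake_attempt)
print(ok, used)
```

Returning the final relaxation value along with the flag lets the caller decide whether to reset it to 1.0 for the next time step or keep the damped value.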
Thanks Evan.  That error_on_nonconvergence parameter will be helpful.
written 13 months ago by Drew Parsons 
Thanks Evan again!
I wonder whether the iteration loop could be made to exit as soon as "nan" first shows up in the value compared against the tolerance. At present the exit occurs only when the maximum number of iterations is reached, even when "nan" is calculated for most of the iterations.
written 6 months ago by Michele Scaraggi 
That sounds like a useful idea to me, Michele. The convergence in residuals is checked in NewtonSolver::converged (dolfin/nls/NewtonSolver.cpp). The values of relative_residual and  _residual could be checked before comparing with rtol and atol, and a dolfin_error invoked if they're hitting nan.
written 3 months ago by Drew Parsons 
Or not quite so simple. dolfin_error halts the program, whereas setting error_on_nonconvergence to False causes a warning to be issued rather than a dolfin_error. So if we trigger a dolfin_error on NaN, the program will stop, which is not what we want. If we issue a warning instead, the warning will display but the iterations will not stop. It is the iteration logic in NewtonSolver.cpp that would need to be altered.
written 3 months ago by Drew Parsons 
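To illustrate the change to the iteration logic discussed in these comments, here is a rough sketch of a Newton loop that leaves as soon as the residual becomes NaN instead of running to the iteration limit. It is a plain-Python scalar toy, not the code in NewtonSolver.cpp; the function newton and both test problems are made up for the illustration.

```python
import math


def newton(residual, jacobian, x0, rtol=1e-9, atol=1e-14, max_iter=20):
    """Scalar Newton iteration with an early exit on a NaN residual.

    Returns (iterations, converged), loosely mirroring the pair
    returned by a dolfin solve. Illustration only.
    """
    x = x0
    r0 = abs(residual(x))
    for k in range(max_iter):
        f = residual(x)
        r = abs(f)
        if math.isnan(r):
            return k, False       # NaN cannot recover: stop iterating now
        if r < atol or (r0 > 0 and r / r0 < rtol):
            return k, True
        x = x - f / jacobian(x)
    return max_iter, False


# Well-behaved problem: x**2 - 2 = 0 starting from x = 1.
good = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)


# Diverging problem: Newton on the cube root doubles |x| at every step;
# the residual is made to return NaN once |x| exceeds 100.
def bad_residual(x):
    if abs(x) > 100.0:
        return float("nan")
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def bad_jacobian(x):
    return abs(x) ** (-2.0 / 3.0) / 3.0


bad = newton(bad_residual, bad_jacobian, 1.0)
print(good, bad)
```

The NaN check comes before the tolerance comparison because any comparison against NaN is false, which is why a loop without the check keeps going until max_iter.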