### Best practice for optimizing compilation of complex forms

68

views

1

Dear all,

for my application I need to build a quite complex form defined on a mixed FunctionSpace which consists of a n-times duplicate of scalar and vectorial FE with n being potentially quite large (up to 50) for instance. Here is a simplified example for n=10:

Generalized strain measures consisting of the field themselves or their gradient are defined and the form contains nxn cross-products of similar strain measures as follows:

I also implemented a variant in which the similar strain measures are collected into a global vector so that the cross-products can be performed using a simple vector, Matrix, vector dot product as follows:

My question is to know if there are some general best practice advice when building such complex forms, especially regarding their compilation. Indeed:

Thanks

for my application I need to build a quite complex form defined on a mixed FunctionSpace which consists of a n-times duplicate of scalar and vectorial FE with n being potentially quite large (up to 50) for instance. Here is a simplified example for n=10:

```
# simple 2D mesh
N = 50
mesh = UnitSquareMesh(N, N)
# FunctionSpace containing `nlayers` vectorial and scalar fields
nlayers = 10
U1e = VectorElement("CG", mesh.ufl_cell(), 2)
U2e = FiniteElement("CG", mesh.ufl_cell(), 1)
list_Ve = [U1e, U2e]*nlayers
V = FunctionSpace(mesh, MixedElement(list_Ve))
```

Generalized strain measures consisting of the field themselves or their gradient are defined and the form contains nxn cross-products of similar strain measures as follows:

```
u = Function(V)
u_ = TestFunction(V)
du = TrialFunction(V)
e = Constant(0.1)
def generalized_deformation(u):
Ug = split(u)
list_e1 = [as_vector([sym(grad(Ug[2*i]))[j, k] for j,k in [(0,0), (1,1), (0,1)]])
for i in range(nlayers)]
list_e2 = [Ug[2*i+3] - Ug[2*i+1] for i in range(nlayers - 1)]
list_e3 = [Ug[2*i+2] - Ug[2*i]
- e*(grad(Ug[2*i+1]) + grad(Ug[2*i+3]))
for i in range(nlayers - 1)]
return list_e1, list_e2, list_e3
def custom_inner(E1, E2):
a_list = []
for (e1, e2) in zip(E1, E2):
for el1 in e1:
for el2 in e2:
s = shape(el1)
if s:
m = s[0]
Ce = as_matrix([[1.]*m]*m)
a_list.append(dot(el1, dot(Ce, el2)))
else:
a_list.append(dot(el1, el2))
return sum(a_list)
dEps = generalized_deformation(du)
Eps_ = generalized_deformation(u_)
a = custom_inner(dEps, Eps_)*dx
tic = time()
jit(a)
print "Compilation time (split):", time()-tic
tic = time()
assemble(a)
print "Assembly time (split):", time()-tic
```

I also implemented a variant in which the similar strain measures are collected into a global vector so that the cross-products can be performed using a simple vector, Matrix, vector dot product as follows:

```
def generalized_deformation_vect(u):
Ug = split(u)
list_e1 = [sym(grad(Ug[2*i]))[j, k] for i in range(nlayers)
for j,k in [(0,0), (1,1), (0,1)]]
list_e2 = [Ug[2*i+3] - Ug[2*i+1] for i in range(nlayers - 1)]
list_e3 = [Ug[2*i+2][j] - Ug[2*i][j]
- e*(grad(Ug[2*i+1])[j] + grad(Ug[2*i+3])[j])
for i in range(nlayers - 1) for j in range(2)]
return as_vector(list_e1), as_vector(list_e2), as_vector(list_e3)
tic = time()
dEps = generalized_deformation_vect(du)
Eps_ = generalized_deformation_vect(u_)
a2 = []
for i in range(3):
n = len(dEps[i])
C = as_matrix([[1.]*n]*n)
a2.append(dot(dEps[i], dot(C, Eps_[i])))
a2 = sum(a2)*dx
tic = time()
jit(a2)
print "Compilation time (vector):", time()-tic
tic = time()
assemble(a2)
print "Assembly time (vector):", time()-tic
```

My question is to know if there are some general best practice advice when building such complex forms, especially regarding their compilation. Indeed:

- using uflacs: compilation of the first (split) approach took 40s while the second (vector) took 35s, assembly time took 14s
- using quadrature: compilation of the first (split) approach took 11s while the second (vector) took 6s, assembly time took 15s

`maximum recursion depth exceeded while calling a Python object`

, no problem was observed with the vector approach:- uflacs compiled in 475s, assembly in 95s
- quadrature compiled in 29s, assembly in 100s

Thanks

Community: FEniCS Project

Have you tried https://github.com/firedrakeproject/tsfc ?

written
25 days ago by
Nate

Yes but I got a JIT failure that I do not really understand...

written
23 days ago by
bleyerj

Please login to add an answer/comment or follow this question.