How well does BoxMesh with refine perform in parallel?

5 weeks ago by
I am trying to analyze parallel mesh generation using BoxMesh with refine. For that I built a 3D elastic cube model of size 62x63x83 with 2 levels of refinement, which gives a problem size of ~48 million DOFs.

My observations:
Mesh building:
with 96 cores, mesh building total wall time = ~13 sec
with 384 cores, mesh building total wall time = ~28 sec
Linear solve (CG and GAMG):
with 96 cores, solve total wall time = ~45 sec
with 384 cores, solve total wall time = ~30 sec
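
To put these timings in strong-scaling terms, here is a minimal sketch that computes speedup and parallel efficiency when going from 96 to 384 cores. The wall times are the approximate values listed above; the names `timings` and `strong_scaling` are hypothetical and chosen only for illustration.

```python
# Approximate wall times in seconds, copied from the observations above.
timings = {
    "mesh building": {96: 13.0, 384: 28.0},
    "linear solve": {96: 45.0, 384: 30.0},
}


def strong_scaling(times, base_cores, more_cores):
    """Return (speedup, efficiency) when going from base_cores to more_cores.

    Speedup is the ratio of wall times; efficiency is the speedup divided by
    the ideal speedup (the ratio of core counts) for a fixed problem size.
    """
    speedup = times[base_cores] / times[more_cores]
    ideal = more_cores / base_cores
    return speedup, speedup / ideal


results = {phase: strong_scaling(t, 96, 384) for phase, t in timings.items()}

for phase, (speedup, efficiency) in results.items():
    print(f"{phase}: speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

A speedup below 1 for mesh building means the 384-core run is slower than the 96-core run, while the solve speeds up but well short of the ideal factor of 4.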

The solve time looks reasonable, given that the model is probably not large enough to scale well to 384 cores. The mesh generation time, on the other hand, increases rather than decreases as cores are added. My understanding is that refine and the parallel mesh partitioning from SCOTCH should be enough to keep mesh generation scalable. Am I missing something silly here, or do these results look acceptable? In the figshare picture of weak scalability on ARCHER, mesh building scales reasonably well, and the times for 96 cores and 384 cores are almost the same. In my case, the 384-core run has a smaller problem size (48 million DOFs) than the 192 million DOFs in the ARCHER weak scalability study, so I would expect mesh building on 384 cores to take less time than on 96 cores.
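
The comparison with the weak-scaling study can be made concrete by looking at the load per core. The sketch below uses only the approximate figures quoted in this post, and it assumes that the 192 million DOF figure corresponds to the 384-core ARCHER run, which is how I read the plot. The helper name `dofs_per_core` is hypothetical.

```python
def dofs_per_core(total_dofs, cores):
    """Average number of degrees of freedom handled by each core."""
    return total_dofs / cores


# Approximate problem sizes quoted in this post.
my_dofs = 48_000_000
archer_dofs = 192_000_000  # assumed to be the 384-core ARCHER run

loads = {
    "mine, 96 cores": dofs_per_core(my_dofs, 96),
    "mine, 384 cores": dofs_per_core(my_dofs, 384),
    "ARCHER, 384 cores": dofs_per_core(archer_dofs, 384),
}

for label, load in loads.items():
    print(f"{label}: ~{load:,.0f} DOFs per core")
```

Under that assumption, my 384-core run carries a quarter of the per-core load of the ARCHER run, which is why I expected mesh building to be no slower than in the 96-core case.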

Some additional information: I am using optimized MPI and MKL libraries, and everything else I could to make my production build highly scalable.

Please share your thoughts...
Community: FEniCS Project