ffc fails on MPI cluster (multiple nodes): Source file should not exist at this point!
12 weeks ago by
Moving new file over differing existing file:
src: /tmp/tmp926dy6f7/ffc_form_bbc590299c63ad57e1103617fd94585c79ed63d7.cpp.gz
dst: /home/username/.cache/dijitso/src/ffc_form_bbc590299c63ad57e1103617fd94585c79ed63d7.cpp.gz
backup: /home/username/.cache/dijitso/src/ffc_form_bbc590299c63ad57e1103617fd94585c79ed63d7.cpp.gz.old
Backup file exists, overwriting.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dolfin/compilemodules/jit.py", line 142, in jit
    result = ffc.jit(ufl_object, parameters=p)
  File "/usr/lib/python3/dist-packages/ffc/jitcompiler.py", line 218, in jit
    module = jit_build(ufl_object, module_name, parameters)
  File "/usr/lib/python3/dist-packages/ffc/jitcompiler.py", line 134, in jit_build
    generate=jit_generate)
  File "/usr/lib/python3/dist-packages/dijitso/jit.py", line 180, in jit
    params)
  File "/usr/lib/python3/dist-packages/dijitso/build.py", line 183, in build_shared_library
    lockfree_move_file(temp_src_filename, src_filename)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 272, in lockfree_move_file
    return _lockfree_move_file(src, dst, False)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 299, in _lockfree_move_file
    _lockfree_move_file(dst, backup, True)
  File "/usr/lib/python3/dist-packages/dijitso/system.py", line 338, in _lockfree_move_file
    raise RuntimeError("Source file should not exist at this point!")
RuntimeError: Source file should not exist at this point!
The same code runs successfully with 16 processes on a single node.
I suspect it is related to using an NFSv4 filesystem for the directory mounted at /data, which I set up to give both nodes access to the same script. On the other hand, the error concerns /home/username/.cache/dijitso, which is local to each node and not shared.
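One way to check the assumption that the cache directory is node-local is to look up, on each node, which mounted filesystem it sits on. The following is a minimal sketch, assuming Linux nodes that expose /proc/mounts. The helper names parse_mounts and filesystem_type and the set of network filesystem types are illustrative only; they are not part of dijitso or FFC.

```python
import os
import socket

# Illustrative, non-exhaustive set of filesystem types treated as "shared".
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "lustre", "glusterfs", "fuse.sshfs"}


def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def filesystem_type(path, mounts):
    """Return the fs type of the longest mount point containing path.

    `path` is expected to be absolute and already resolved.
    """
    best_point, best_type = "", None
    for point, fs_type in mounts:
        prefix = point.rstrip("/") + "/"
        inside = path == point or path.startswith(prefix)
        # ">=" lets a later mount on the same mount point take precedence.
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fs_type
    return best_type


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    # Path taken from the error message above; adjust if the cache lives elsewhere.
    cache_dir = os.path.realpath(os.path.expanduser("~/.cache/dijitso"))
    with open("/proc/mounts") as f:
        mounts = parse_mounts(f.read())
    fs_type = filesystem_type(cache_dir, mounts)
    shared = fs_type in NETWORK_FS_TYPES
    print(f"{socket.gethostname()}: {cache_dir} is on {fs_type} (network fs: {shared})")
```

Running this once per node (for example through the same MPI launcher used for the failing job) prints the hostname and the filesystem type backing the cache directory. If any node reports a network filesystem type, the cache is shared after all.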
I have flufl.lock 2.4.1 installed, which is supposed to help with NFS, but it doesn't seem to have made a difference. What else should I do to get the code running on multiple nodes?
Community: FEniCS Project