Commit
[BBPBGLIB-1139] Missing exception logging on configuration errors (#142)
## Context

Sometimes, when the SONATA config file contained an error (e.g. a typo), the run failed silently, without logging any exception.

## Scope

There is a hack in commands.py that synchronizes exception logging by allowing only a single node to log the exception to stderr (otherwise, production runs would be flooded with exception logs). It relies on a temp file to which the MPI ranks of failed runs are written, with the first rank chosen to report the exception. This temp file was supposed to be deleted on startup, but if the exception happened early enough (e.g. because of a typo in the config file), the removal code was never reached, and the stale file interfered with exception logging in subsequent runs.

This attempt at leader election is a hack, to say the least, but with this fix it should work for now. We might want to think about how to implement it properly later; temp files are fragile and non-atomic.

The code that removes the temp file has been moved to an earlier point in the execution, in commands.py.

## Testing

It is hard to test this in a multi-node environment. The change is minor. We might want to write a unit test that checks whether the file is actually removed after commands.py starts, but does it make sense?

## Review

* [X] PR description is complete
* [X] Coding style (imports, function length, new functions, classes or files) is good
* [?] Unit/Scientific test added
* [X] Updated Readme, in-code, developer documentation
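For illustration, the ordering issue described under Scope, and the kind of unit test raised under Testing, could look roughly like the sketch below. This is not the code in commands.py: the names `EXCEPTION_NODE_FILE`, `cleanup_exception_file`, `log_exception_once` and `run` are hypothetical, the marker path is made up, and the sketch is single-process, so it ignores the MPI synchronization a real multi-rank run would need.

```python
import logging
import os
import tempfile

# Hypothetical marker path; the real file name used by commands.py is not shown here.
EXCEPTION_NODE_FILE = os.path.join(tempfile.gettempdir(), "exception_node.example")


def cleanup_exception_file(path=EXCEPTION_NODE_FILE):
    """Remove a stale marker file left behind by a previous failed run."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # nothing to clean up


def log_exception_once(rank, exc, path=EXCEPTION_NODE_FILE):
    """Append this rank to the marker file; only the first rank listed logs.

    Returns True if this rank reported the exception.
    """
    with open(path, "a") as f:
        f.write(f"{rank}\n")
    with open(path) as f:
        first_rank = int(f.readline())
    if first_rank == rank:
        logging.error("Rank %d failed: %s", rank, exc, exc_info=exc)
        return True
    return False


def run(load_config, rank=0, path=EXCEPTION_NODE_FILE):
    """Entry point sketch: clean up BEFORE anything that can raise."""
    cleanup_exception_file(path)
    try:
        load_config()  # e.g. parsing a config file that contains a typo
    except Exception as exc:
        log_exception_once(rank, exc, path)
        return 1
    return 0
```

If the cleanup ran after `load_config()` instead, a marker left by an earlier run (say, listing rank 5) would still be present when the config error occurs, so no rank in the new run would consider itself first and the exception would go unreported. A unit test along the lines suggested under Testing could pre-create the marker file, call the entry point with a failing loader, and assert that the first rank in the file is the current one.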