I have run into an issue with the file locking: a long-running test script was still running when the next job was fetched. Both tests were then running at the same time, and the new one failed because it requires a certain port to be free, which was blocked by the other test.
I believe the issue lies in the locking component. According to twisted's FilesystemLock documentation, a call to .lock() returns True if the lock was acquired and False otherwise. The locking component does not check this return value, so if the call fails, as in my case above, it fails silently and the system believes it holds the lock.
When the second test script fails, it unlocks the lock file. Then, when the first test completes, I get the following error, which clearly indicates that the lock was released by someone else, which should not happen.
Traceback (most recent call last):
  File "/usr/local/bin/opensubmit-exec", line 11, in <module>
    sys.exit(console_script())
  File "/usr/local/lib/python3.6/dist-packages/opensubmitexec/cmdline.py", line 120, in console_script
    download_and_run(config)
  File "/usr/local/lib/python3.6/dist-packages/opensubmitexec/locking.py", line 45, in __exit__
    self.flock.unlock()
  File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 221, in unlock
    "Lock %r not owned by this process" % (self.name,))
ValueError: Lock '/tmp/executor.lock' not owned by this process
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 67, in apport_excepthook
    binary = os.path.realpath(os.path.join(os.getcwd(), sys.argv[0]))
FileNotFoundError: [Errno 2] No such file or directory

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/opensubmit-exec", line 11, in <module>
    sys.exit(console_script())
  File "/usr/local/lib/python3.6/dist-packages/opensubmitexec/cmdline.py", line 120, in console_script
    download_and_run(config)
  File "/usr/local/lib/python3.6/dist-packages/opensubmitexec/locking.py", line 45, in __exit__
    self.flock.unlock()
  File "/usr/lib/python3/dist-packages/twisted/python/lockfile.py", line 221, in unlock
    "Lock %r not owned by this process" % (self.name,))
ValueError: Lock '/tmp/executor.lock' not owned by this process
As an addition: the code of twisted's lock() call shows that it never waits for the lock. If the lock symlink already exists and is held by a process that is still running, the call returns False right away.
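To illustrate the non-blocking behaviour described above, here is a minimal, simplified sketch of a symlink-based lock. It is not twisted's actual code: the helper names `try_symlink_lock` and `release_symlink_lock` are hypothetical, the sketch is POSIX-only, and it leaves out the check for whether the recorded owner process is still alive.

```python
import errno
import os
import tempfile


def try_symlink_lock(path):
    """Try once to take a symlink-based lock; never blocks.

    Simplified assumption: no liveness check on the recorded owner.
    """
    try:
        # Creating a symlink is atomic; its target records the owner's PID.
        os.symlink(str(os.getpid()), path)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            return False  # the lock already exists, so report failure
        raise
    return True


def release_symlink_lock(path):
    """Remove the lock, refusing if the recorded owner is another process."""
    if os.readlink(path) != str(os.getpid()):
        raise ValueError("Lock %r not owned by this process" % (path,))
    os.remove(path)


# Demonstration in a throwaway directory (hypothetical lock path).
lock_path = os.path.join(tempfile.mkdtemp(), "executor.lock")
first = try_symlink_lock(lock_path)    # lock is free
second = try_symlink_lock(lock_path)   # lock is already taken
release_symlink_lock(lock_path)
third = try_symlink_lock(lock_path)    # free again after release
release_symlink_lock(lock_path)
```

The point of the sketch is that a failed attempt is only visible through the return value; a caller that ignores it proceeds as if it held the lock.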
This should be rather straightforward to fix, and I'm willing to prepare a PR for it in the new year, if wanted. I guess there just needs to be an additional check of whether the lock was actually acquired.
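A possible shape for that additional check, as a sketch only: the names `CheckedLock`, `LockNotAcquired` and `FakeLock` are hypothetical and do not come from the opensubmit code base. `FakeLock` is a stand-in for twisted's FilesystemLock (same `lock()`/`unlock()` contract) so that the example runs without twisted installed.

```python
class LockNotAcquired(RuntimeError):
    """Raised when the lock is already held elsewhere."""


class CheckedLock:
    """Context manager that refuses to proceed without the lock."""

    def __init__(self, flock):
        self.flock = flock
        self.acquired = False

    def __enter__(self):
        # The important part: check the return value of lock().
        self.acquired = bool(self.flock.lock())
        if not self.acquired:
            raise LockNotAcquired("lock is held by another process")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only release a lock that this instance actually acquired.
        if self.acquired:
            self.flock.unlock()
            self.acquired = False
        return False  # do not swallow exceptions from the body


class FakeLock:
    """Stand-in for FilesystemLock, for demonstration only."""

    def __init__(self, held_elsewhere=False):
        self.held_elsewhere = held_elsewhere
        self.locked = False

    def lock(self):
        if self.held_elsewhere:
            return False
        self.locked = True
        return True

    def unlock(self):
        if not self.locked:
            raise ValueError("Lock not owned by this process")
        self.locked = False


free = FakeLock()
with CheckedLock(free):
    ran_with_lock = free.locked

busy = FakeLock(held_elsewhere=True)
try:
    with CheckedLock(busy):
        ran_without_lock = True
except LockNotAcquired:
    ran_without_lock = False
```

With twisted installed, the wrapper could presumably be given `FilesystemLock("/tmp/executor.lock")` from `twisted.python.lockfile` instead of the stand-in. Whether the executor should then skip the job, retry later, or exit is a separate design decision.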
Locking component referenced above: opensubmit/executor/opensubmitexec/locking.py, lines 31 to 45 in 3600025