Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: #7030 use temp-dir instead of data-dir for lock files #7031

Closed
wants to merge 1 commit into from

Conversation

guidoschreuder
Copy link
Contributor

Fixes Issue

#7030

Description of Change

Use temp directory for lockfile instead of data directory

Have test cases been added to cover the new functionality?

no

@boring-cyborg boring-cyborg bot added the core changes to core label Oct 10, 2024
@guidoschreuder guidoschreuder changed the title fix #7030: use temp-dir instead of data-dir for lock files fix: #7030 use temp-dir instead of data-dir for lock files Oct 10, 2024
@aikebah
Copy link
Collaborator

aikebah commented Oct 10, 2024

tempDirectory = FileUtils.createTempDirectory(baseTemp);

Your 'fix' is breaking the exclusive locking meant to ensure that 'during update' the files will not be used by another parallel execution of ODC that uses the same datadirectory.

Tempdirectory will be a unique folder per ODC process.

@guidoschreuder guidoschreuder marked this pull request as draft October 11, 2024 07:14
@guidoschreuder
Copy link
Contributor Author

@aikebah thanks for the response, i've converted the PR to draft, since it was kind of low effort, end-of-day and couldn't get the tests running properly locally

But, i see, would it be acceptable to use system temp directory for lock files?

@guidoschreuder
Copy link
Contributor Author

OK, the tests do look better now.

The once that are failing are also failing for me locally on a clean checkout of this repo, so i'm not sure what to do with that

@guidoschreuder guidoschreuder marked this pull request as ready for review October 11, 2024 07:54
@aikebah
Copy link
Collaborator

aikebah commented Oct 11, 2024

@guidoschreuder system temp folder for the locks would, for the same reasons, typically only work for single system CI environments. Containerized with volume-mounted datafolder and multiserver with network-mounted datafolder would also need the lock folder to be shared with the same granularity so that all instances that share the same datafolder also share the same locks folder.

The only change I can imagine to support your setup while retaining the intended locking semantics is to add a configuration possibility for a separate data-locks folder (that would need to be shared between builds in the exact same granularity as the datafolder itself and that would typically default, if not configured explicitly, to the datafolder itself)

@guidoschreuder
Copy link
Contributor Author

@aikebah thanks for the clarification, i totally get your point

that solution would work for me, so, i could continue on that if that would be accepted

(just for background) we really would never experience any problems without locking, what we do is, roughly:

  • we have a single, exclusive, task that runs periodically
  • we maintain multiple versions of the data directory e.g. /owasp_cache/d1, /owasp_cache/d2, etc.
  • before updating we copy the latest directory to a new directory and then run update over that
  • when successful we ln -snf /owasp_cache/cache /owasp_cache/d3
  • and then rotate old directories away

@aikebah
Copy link
Collaborator

aikebah commented Oct 11, 2024

When I spotted your issue in the tracker an approach along those lines would be what I had in mind to resolve it, so if you code a PR along those lines I would be happy to accept it.

Be aware though that there are also the various dynamic caches (unless you disable their caching): for centralAnalyzer, NodeAudit and the OSSIndexAnalyzer.

Depending on the frequency of your builds and the amount of overlap in dependencies across projects you might have a benefit by hosting those on a shared, writeable volume/mount to have the builds profit from a lower number of dependencies to lookup at the remote APIs for those.

These dynamic caches are currently stored (if writeable, otherwise caching is automatically disabled) inside two subfolders (oss_cache and cache) underneath the data folder; I think these locations are also not configurable at the moment so if the caching would be valuable to you you might want to extend configurability to also allow these two cache folders to point to (shared across your builds) writeable locations outside of your read-only mounted data folder.

@jeremylong
Copy link
Owner

Closing the PR as a better approach would be figuring out why a lock is generated when auto-update is disabled.

@jeremylong jeremylong closed this Oct 26, 2024
@aikebah
Copy link
Collaborator

aikebah commented Oct 26, 2024

@jeremylong lock is taken by the engine to guard that the copy-to-a-temp-folder of the H2 database file for readOnly access copies from a stable and consistent state -

public void openDatabase(boolean readOnly, boolean lockRequired) throws DatabaseException {

The strategy to copy over to a temp location for read-only acces dates back all the way to version 3 and was introduced by 04dc5f8

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 4, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
core changes to core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants