feat: conditionally write data to file instead of keeping it in memory #5497
Pull Request Submission
Risk assessment is low: by default this feature is only enabled for GAF workflow data larger than 512MB, and it can be manually disabled.
Please check the boxes once done.
The pull request must:
feat: or fix:, others might be used in rare occasions as well, if there is no need to document the changes in the release notes. The changes or fixes should be described in detail in the commit message for the changelog & release notes.
Pull Request Review
All pull requests must undergo a thorough review process before being merged.
The review process of the code PR should include code review, testing, and any necessary feedback or revisions.
Pull request reviews of functionality developed in other teams only review the given documentation and test reports.
Manual testing will not be performed by the reviewing team, and is the responsibility of the author of the PR.
For Node projects: It’s important to make sure changes in package.json are also affecting package-lock.json correctly.
If a dependency is not necessary, don’t add it.
When adding a new package as a dependency, make sure that the change is absolutely necessary. We would like to refrain from adding new dependencies when possible.
Documentation PRs in gitbook are reviewed by Snyk's content team. They will also advise on the best phrasing and structuring if needed.
Pull Request Approval
Once a pull request has been reviewed and all necessary revisions have been made, it is approved for merging into
the main codebase. The merging of the code PR is performed by the code owners, the merging of the documentation PR
by our content writers.
What does this PR do?
This PR adds a new capability to set a threshold in bytes which determines whether to keep GAF workflow data items in memory, or write to disk.
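A minimal sketch of the threshold decision described above, to make the intent concrete for reviewers. It is not the GAF implementation: `storePayload`, the constant names, and the temp file name pattern are hypothetical, and treating a payload of exactly the threshold size as "keep in memory" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// defaultThresholdBytes mirrors the 512MB default described in this PR.
const defaultThresholdBytes int64 = 512 * 1024 * 1024

// thresholdDisabled is the sentinel value (-1) that turns the feature off.
const thresholdDisabled int64 = -1

// storePayload is a hypothetical helper, not the GAF API. It returns the path
// of the file the payload was written to, or "" if the payload stays in memory.
// Assumption: only payloads strictly larger than the threshold go to disk.
func storePayload(payload []byte, thresholdBytes int64, tmpDir string) (string, error) {
	if thresholdBytes == thresholdDisabled || int64(len(payload)) <= thresholdBytes {
		return "", nil // feature disabled or payload small enough: keep in memory
	}
	f, err := os.CreateTemp(tmpDir, "workflow-data-*")
	if err != nil {
		return "", err
	}
	if _, err := f.Write(payload); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	tmpDir, err := os.MkdirTemp("", "snyk-sketch-")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(tmpDir)

	payload := []byte("example payload")
	// The last threshold (4 bytes) is deliberately tiny so the payload spills to disk.
	for _, threshold := range []int64{thresholdDisabled, defaultThresholdBytes, 4} {
		path, err := storePayload(payload, threshold, tmpDir)
		if err != nil {
			fmt.Println("store failed:", err)
			continue
		}
		fmt.Printf("threshold=%d onDisk=%t\n", threshold, path != "")
	}
}
```

Returning an empty path for the in-memory case keeps the sketch short; a real data item would more likely carry either the bytes or a file reference.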
It also fixes a bug that arises from this change, where analytics were being sent after the temp directories had already been cleaned up.
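The ordering fix can be illustrated roughly as follows. This is a sketch under assumptions, not the CLI's actual shutdown code: `shutdown`, `sendAnalytics`, and `cleanup` are hypothetical names, and the idea that analytics may need to read data stored under the temp directory is an assumption about why the order matters.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// shutdown runs the analytics step before the cleanup step, so anything
// analytics needs from the temp directory still exists when it runs.
// Cleanup runs even if analytics fails; the analytics error takes precedence.
func shutdown(sendAnalytics, cleanup func() error) error {
	analyticsErr := sendAnalytics()
	cleanupErr := cleanup()
	if analyticsErr != nil {
		return analyticsErr
	}
	return cleanupErr
}

func main() {
	tmpDir, err := os.MkdirTemp("", "snyk-sketch-")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	payloadPath := filepath.Join(tmpDir, "payload")
	if err := os.WriteFile(payloadPath, []byte("example payload"), 0600); err != nil {
		os.RemoveAll(tmpDir)
		fmt.Println("setup failed:", err)
		return
	}

	// Hypothetical analytics step that reads a payload stored on disk.
	sendAnalytics := func() error {
		data, err := os.ReadFile(payloadPath)
		if err != nil {
			return err
		}
		fmt.Printf("analytics sent (%d payload bytes read)\n", len(data))
		return nil
	}
	// Hypothetical cleanup step that removes the temp directory.
	cleanup := func() error {
		if err := os.RemoveAll(tmpDir); err != nil {
			return err
		}
		fmt.Println("temp dir deleted")
		return nil
	}

	if err := shutdown(sendAnalytics, cleanup); err != nil {
		fmt.Println("shutdown failed:", err)
	}
}
```

With the two calls swapped, the read inside `sendAnalytics` would fail because the file no longer exists, which is the kind of failure the reordering avoids.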
Where should the reviewer start?
See related GAF changes.
2 new parameters are considered when conditionally writing data to disk:
SNYK_TMP_PATH - determines the temp path to write data to; uses the OS-specific cache dir if not defined
INTERNAL_IN_MEMORY_THRESHOLD_BYTES - threshold in bytes that determines whether to write data to disk or keep it in memory. Defaults to 512MB if not defined; a value of -1 disables this feature.
How should this be manually tested?
make clean build
SNYK_LOG_LEVEL=trace ./binary-releases/snyk-<ARCH> code test -d
a. include the env vars SNYK_TMP_DIR_PATH and INTERNAL_IN_MEMORY_THRESHOLD_BYTES as appropriate for each test case
Test case 1 - Set SNYK_TMP_DIR_PATH, do not set INTERNAL_IN_MEMORY_THRESHOLD_BYTES
Expected behaviour
native code workflow passes the configuration when creating new data so should take into account the SNYK_TMP_DIR_PATH and use the default 512MB memory threshold
all other workflows do not pass the configuration when creating new data so should have this feature disabled. i.e. they should all read from memory
Validation
Look for the log messages:
a. internal.cleanup:4 - Deleted temporary directory: <SNYK_TMP_DIR_PATH>
b. code.test:1 - payload is []byte, comparing payload size (<SIZE> bytes) to threshold (536870912 bytes)
c. output:2 - memory threshold feature disabled, keeping payload in memory
Test case 2 - Set INTERNAL_IN_MEMORY_THRESHOLD_BYTES, do not set SNYK_TMP_DIR_PATH
Expected behaviour
native code workflow uses the INTERNAL_IN_MEMORY_THRESHOLD_BYTES value and uses the default SNYK_TMP_DIR_PATH value
all other workflows should have this feature disabled. i.e. they should all read from memory
Validation
Look for the log messages:
a. internal.cleanup:4 - Deleted temporary directory: <OS_SPECIFIC_CACHE_DIR>
b. code.test:1 - payload is []byte, comparing payload size (<SIZE> bytes) to threshold (<INTERNAL_IN_MEMORY_THRESHOLD_BYTES> bytes)
c. output:2 - memory threshold feature disabled, keeping payload in memory
Test case 3 - Set SNYK_TMP_DIR_PATH and INTERNAL_IN_MEMORY_THRESHOLD_BYTES
Expected behaviour
native code workflow passes the configuration when creating new data so should take into account both SNYK_TMP_DIR_PATH and INTERNAL_IN_MEMORY_THRESHOLD_BYTES
all other workflows should have this feature disabled. i.e. they should all read from memory
Validation
Look for the log messages:
a. internal.cleanup:4 - Deleted temporary directory: <SNYK_TMP_DIR_PATH>
b. code.test:1 - payload is []byte, comparing payload size (<SIZE> bytes) to threshold (<INTERNAL_IN_MEMORY_THRESHOLD_BYTES> bytes)
c. output:2 - memory threshold feature disabled, keeping payload in memory
Test case 4 - do not set SNYK_TMP_DIR_PATH or INTERNAL_IN_MEMORY_THRESHOLD_BYTES
Expected behaviour
native code workflow passes the configuration when creating new data so should use the default SNYK_TMP_DIR_PATH and INTERNAL_IN_MEMORY_THRESHOLD_BYTES values
all other workflows should have this feature disabled. i.e. they should all read from memory
Validation
Look for the log messages:
a. internal.cleanup:4 - Deleted temporary directory: <OS_SPECIFIC_CACHE_DIR>
b. code.test:1 - payload is []byte, comparing payload size (<SIZE> bytes) to threshold (536870912 bytes)
c. output:2 - memory threshold feature disabled, keeping payload in memory
Test case 5 - Set INTERNAL_IN_MEMORY_THRESHOLD_BYTES=-1
Expected behaviour
all workflows should have this feature disabled. i.e. they should all read from memory
Validation
Look for the log messages:
a. internal.cleanup:4 - Deleted temporary directory: <OS_SPECIFIC_CACHE_DIR>
b. code.test:1 - memory threshold feature disabled, keeping payload in memory
c. output:2 - memory threshold feature disabled, keeping payload in memory
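As a quick reference, the five test cases above can be summarised as a table-driven sketch of how the two env var values resolve to an effective temp dir and threshold. `resolve` is a hypothetical helper, not the CLI's configuration code; `/custom/tmp` and `1024` are illustrative values; using `os.UserCacheDir()` for the OS-specific cache dir and returning an error for a non-numeric threshold are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultThresholdBytes is the 512MB default (536870912 bytes in the logs).
const defaultThresholdBytes int64 = 512 * 1024 * 1024

// resolve maps the raw env var values ("" means not set) to the effective
// temp dir and threshold. A threshold of -1 means the feature is disabled.
func resolve(tmpPathEnv, thresholdEnv, osCacheDir string) (string, int64, error) {
	tmpDir := tmpPathEnv
	if tmpDir == "" {
		tmpDir = osCacheDir // fall back to the OS-specific cache dir
	}
	threshold := defaultThresholdBytes
	if thresholdEnv != "" {
		parsed, err := strconv.ParseInt(thresholdEnv, 10, 64)
		if err != nil {
			return "", 0, fmt.Errorf("invalid threshold %q: %w", thresholdEnv, err)
		}
		threshold = parsed
	}
	return tmpDir, threshold, nil
}

func main() {
	cacheDir, err := os.UserCacheDir()
	if err != nil {
		cacheDir = os.TempDir()
	}
	cases := []struct{ name, tmpPath, threshold string }{
		{"test case 1", "/custom/tmp", ""},
		{"test case 2", "", "1024"},
		{"test case 3", "/custom/tmp", "1024"},
		{"test case 4", "", ""},
		{"test case 5", "", "-1"},
	}
	for _, c := range cases {
		dir, threshold, err := resolve(c.tmpPath, c.threshold, cacheDir)
		if err != nil {
			fmt.Println(c.name, "error:", err)
			continue
		}
		fmt.Printf("%s: tmpDir=%s threshold=%d disabled=%t\n", c.name, dir, threshold, threshold == -1)
	}
}
```

This only covers the native code workflow, which passes the configuration; the other workflows keep everything in memory regardless of these values.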
Any background context you want to provide?
What are the relevant tickets?
CLI-508
Screenshots
Additional questions