fix: oom during remediation annotations on large number of issues #5284

Closed

@cmars cmars commented Jun 3, 2024

When there are a large number of SCA issues, remediation information is added in nested loops. Excessive allocation in the innermost loop was causing an OOM when the product of the number of vulnerabilities, issues, and paths was sufficiently high.

This change modifies issue data in place rather than creating a copy. Doing so probably violates type constraints; those concerns are cast away in the interest of saving memory.
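
To illustrate the trade-off described above, here is a minimal sketch of copy-based versus in-place annotation. The `Issue`, `Remediation`, and `AnnotatedIssue` types and both function names are hypothetical and are not the CLI's actual types or the code in this PR; they only show where the per-issue allocation comes from and what the cast looks like.

```typescript
// Hypothetical shapes for illustration only; the real CLI types differ.
interface Issue {
  id: string;
  from: string[];
}

interface Remediation {
  upgradeTo: string;
}

type AnnotatedIssue = Issue & { remediation?: Remediation };

// Copying variant: allocates a fresh object per issue and copies every field.
// Inside nested loops this multiplies the number of live objects.
function annotateByCopy(
  issues: Issue[],
  remediations: Record<string, Remediation>,
): AnnotatedIssue[] {
  return issues.map((issue) => ({
    ...issue,
    remediation: remediations[issue.id],
  }));
}

// In-place variant: attaches the field to the existing object.
// The casts are where the type constraints are deliberately set aside.
function annotateInPlace(
  issues: Issue[],
  remediations: Record<string, Remediation>,
): AnnotatedIssue[] {
  for (const issue of issues) {
    const remediation = remediations[issue.id];
    if (remediation !== undefined) {
      (issue as AnnotatedIssue).remediation = remediation;
    }
  }
  return issues as AnnotatedIssue[];
}
```

The in-place variant returns the same array and the same objects it was given, so any other code holding a reference to those issues will see the added field.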

Fixes CLI-261, CLI-248.

Notes

If I run this with --json-file-output AND comment out the JSON.stringify in jsonStringifyLargeObject I can get a large container to write JSON output.

Running

npm run dev -- container test --platform=linux/amd64 --json-file-output=acryldata.json acryldata/datahub-ingestion:v0.11.0.4

produces a lot of text output and a rather large JSON file (about 55 GB):

% ls -l acryldata.json
-rw-r--r--  1 c  staff  55496110451 Jun  3 13:32 acryldata.json

On an M1 Mac, the run took about 10 minutes longer to complete with JSON output than without it.

So I think this needs more work. Replacing that JSON.stringify with streaming output should address the bug completely.

Even inside a try block, the JSON.stringify call still crashes ts-node on my machine.
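
As a rough sketch of what "streaming output" could mean here, the serializer below emits JSON in small chunks through a callback instead of building one giant string. `writeJson` and `Write` are hypothetical names, not existing CLI functions. The sketch assumes the input is plain JSON data (objects, arrays, strings, numbers, booleans, null) and does not handle `toJSON`, cycles, or a top-level `undefined`.

```typescript
// Hypothetical sink type: anything that accepts a chunk of text.
type Write = (chunk: string) => void;

// Serializes `value` piece by piece. Only primitives and object keys are ever
// passed to JSON.stringify, so no string the size of the whole document is built.
function writeJson(value: unknown, write: Write): void {
  if (Array.isArray(value)) {
    write('[');
    for (let i = 0; i < value.length; i++) {
      if (i > 0) write(',');
      // Match JSON.stringify: undefined array elements become null.
      writeJson(value[i] === undefined ? null : value[i], write);
    }
    write(']');
  } else if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    write('{');
    let first = true;
    for (const key of Object.keys(obj)) {
      const child = obj[key];
      // Match JSON.stringify: undefined properties are omitted.
      if (child === undefined) continue;
      if (!first) write(',');
      first = false;
      write(JSON.stringify(key) + ':');
      writeJson(child, write);
    }
    write('}');
  } else {
    write(JSON.stringify(value));
  }
}
```

In Node, `write` could forward each chunk to a writable file stream. A real implementation would also need to respect backpressure and preserve whatever formatting options the existing output uses; neither is shown here.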


Pull Request Submission

Please check the boxes once done.

The pull request must:

  • Reviewer Documentation
    • follow CONTRIBUTING rules
    • be accompanied by a detailed description of the changes
    • contain a risk assessment of the change (Low | Medium | High) with regard to breaking existing functionality. For example, a change to an underlying language plugin can completely break the functionality for that language while appearing as only a version change in the dependencies.
    • highlight breaking API if applicable
    • contain a link to the automatic tests that cover the updated functionality.
    • contain testing instructions in case the reviewer wants to verify manually as well, in addition to the manual testing done by the author.
    • link to the PR for the user-facing documentation
  • User facing Documentation
    • update any relevant documentation in gitbook by submitting a gitbook PR, and including the PR link here
    • ensure that the message of the final single commit is descriptive and prefixed with either feat: or fix:. Other prefixes might be used on rare occasions, if there is no need to document the changes in the release notes. The changes or fixes should be described in detail in the commit message for the changelog & release notes.
  • Testing
    • Changes, removals and additions to functionality must be covered by acceptance / integration tests or smoke tests - either already existing ones, or new ones, created by the author of the PR.

Pull Request Review

All pull requests must undergo a thorough review process before being merged.
The review process of the code PR should include code review, testing, and any necessary feedback or revisions.
Pull request reviews of functionality developed in other teams only review the given documentation and test reports.

Manual testing will not be performed by the reviewing team, and is the responsibility of the author of the PR.

For Node projects: It’s important to make sure changes in package.json are also reflected correctly in package-lock.json.

If a dependency is not necessary, don’t add it.

When adding a new package as a dependency, make sure that the change is absolutely necessary. We would like to refrain from adding new dependencies when possible.
Documentation PRs in gitbook are reviewed by Snyk's content team. They will also advise on the best phrasing and structuring if needed.

Pull Request Approval

Once a pull request has been reviewed and all necessary revisions have been made, it is approved for merging into
the main codebase. The merging of the code PR is performed by the code owners, the merging of the documentation PR
by our content writers.

What does this PR do?

Where should the reviewer start?

How should this be manually tested?

Any background context you want to provide?

What are the relevant tickets?

Screenshots

Additional questions

github-actions bot commented Jun 3, 2024

Warnings
⚠️

Since the CLI is unifying on a standard and improved tooling, we're starting to migrate old-style imports and exports to ES6 ones.
A file you've modified is using either module.exports or require(). If you can, please update them to ES6 import syntax and export syntax.
Files found:

  • src/lib/json.ts
⚠️

You've modified files in src/ directory, but haven't updated anything in test folder. Is there something that could be tested?

Generated by 🚫 dangerJS against 1cec114

@cmars cmars force-pushed the fix/oom-annotated-issues branch from e06ca06 to 1cec114 Compare June 3, 2024 22:32
@cmars cmars closed this Jun 5, 2024