--exclude-pkgs option #1229

Open
rchincha opened this issue Sep 27, 2022 · 12 comments
Labels
enhancement New feature or request good-first-issue Good for newcomers

Comments

@rchincha

What would you like to be added:

Once packages are discovered using the cataloger, can I specify a list of packages to be excluded?

Why is this needed:

Generate an SBOM only for a subset of packages.

Additional context:

@rchincha rchincha added the enhancement New feature or request label Sep 27, 2022
@spiffcs
Contributor

spiffcs commented Sep 28, 2022

Thanks for the issue @rchincha!

Can you walk us through the reasoning for excluding packages? We want syft to be as close to the truth as possible when generating an SBOM.

Allowing users to exclude or omit packages that are present and cataloged seems a little outside of that goal.

Definitely happy to talk through how you would use it!

@spiffcs spiffcs added this to OSS Sep 28, 2022
@spiffcs spiffcs moved this to Parking Lot (Comments or Progress) in OSS Sep 28, 2022
@spiffcs
Contributor

spiffcs commented Sep 28, 2022

We've also added this to the agenda for tomorrow's community meeting for syft and grype. Feel free to join there as well, and we'll get other feedback from the community!

https://twitter.com/GrypeProject/status/1574431163799801856?cxt=HHwWgMC8maizwNkrAAAA

@rchincha
Author

> Thanks for the issue @rchincha!
>
> Can you walk us through the reasoning for excluding packages? We want syft to be as close to the truth as possible when generating an SBOM.
>
> Allowing users to exclude or omit packages that are present and cataloged seems a little outside of that goal.
>
> Definitely happy to talk through how you would use it!

Thanks @spiffcs.

We have a situation where we build a chroot without a package db of any sort. So in order to use syft's SBOM capability, we set up a separate environment with a base distro install, install the required packages on top of it, and would now like to exclude the packages in the base install if appropriate. Alternatively, instead of a blacklist (--exclude), perhaps a whitelist (--include) would work better. Hope the problem statement is clear.
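Until something like this exists in syft itself, the same effect can be approximated by post-processing two SBOMs. The following is a minimal sketch, not a syft feature: it assumes both the base environment and the full environment have been exported in Syft's JSON format, and that packages live in a top-level `artifacts` list with `id`, `name` and `version` fields and that relationships live in `artifactRelationships` with `parent` and `child` fields. The function names `package_keys` and `subtract_base` are hypothetical.

```python
def package_keys(sbom):
    """Return the set of (name, version) pairs listed in a Syft JSON document."""
    return {(a["name"], a["version"]) for a in sbom.get("artifacts", [])}


def subtract_base(full, base):
    """Return a copy of `full` without the packages that also appear in `base`.

    Assumption: a package is "the same" when name and version both match.
    """
    base_keys = package_keys(base)
    all_artifacts = full.get("artifacts", [])
    kept = [a for a in all_artifacts
            if (a["name"], a["version"]) not in base_keys]

    kept_ids = {a.get("id") for a in kept}
    removed_ids = {a.get("id") for a in all_artifacts} - kept_ids

    result = dict(full)  # shallow copy; the input document is not modified
    result["artifacts"] = kept
    if "artifactRelationships" in full:
        # Drop relationships that point at a removed package so that the
        # filtered document has no dangling references.
        result["artifactRelationships"] = [
            r for r in full["artifactRelationships"]
            if r.get("parent") not in removed_ids
            and r.get("child") not in removed_ids
        ]
    return result
```

In practice the two inputs would be loaded with `json.load` from files produced by scanning the base install and the full install separately.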

@rchincha
Author

rchincha commented Oct 5, 2022

Any additional thoughts/updates on this?

@kzantow
Contributor

kzantow commented Oct 6, 2022

Hi @rchincha, we discussed this at the community meeting last week (see the notes here). If I understand the use case you're talking about, it's less about excluding packages and more about only including user-defined packages (but excluding the base image packages), is this correct? If so, this is something that has been asked for before and something we'd like to do. We have the concept of scopes, but currently only squashed and all-layers. We would add another scope, something like user-layers, which I suspect would be easier for you to use than an explicit exclude list of packages. What do you think?

@spiffcs spiffcs added the good-first-issue Good for newcomers label Oct 6, 2022
@rchincha
Author

rchincha commented Oct 6, 2022

@kzantow, yes spot-on our requirement.

Your suggestion about user-layers could also work. I assume you will work out how one would determine which layers are the user layers. For us, given a base set of layers, we can install all our packages in a new layer; scanning and reporting from that new layer alone could then work.

https://github.com/anchore/syft#sbom
^ also could you expand a bit more about squashed and all-layers. What is the difference? Perhaps an example or two, for our understanding.

@kzantow
Contributor

kzantow commented Oct 6, 2022

@rchincha the idea is we would just exclude the layers from the base image, so any layers you add from your own Dockerfile would be included. I'm not sure we've worked out every detail here, but that's the gist.
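As a rough illustration of that gist (not how syft or stereoscope implement it, since no such scope exists yet), the user layers can be thought of as whatever follows the base image's layers in the final image. The sketch below assumes each image is described by an ordered list of layer identifiers, such as layer digests, and that an image built on a base starts with exactly the base's layers. The name `user_layers` is hypothetical.

```python
def user_layers(image_layers, base_layers):
    """Return the layers of an image that were added on top of its base image.

    Both arguments are ordered lists of layer identifiers (bottom layer first).
    Assumption: a derived image begins with exactly the base image's layers.
    """
    if image_layers[:len(base_layers)] != base_layers:
        raise ValueError("image is not built on top of the given base layers")
    return image_layers[len(base_layers):]
```

For example, an image with layers `["sha256:a", "sha256:b", "sha256:c"]` built on a base with layers `["sha256:a", "sha256:b"]` has the single user layer `["sha256:c"]`.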

As for squashed vs all-layers:

  • squashed: only scans the final filesystem, i.e. the result of applying all layers in order
  • all-layers: scans each layer in the image individually

The difference here is all-layers would find things that were present at one point, but removed before the final filesystem.
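A toy model may make the difference concrete. This is only a sketch under simplifying assumptions, not Syft's implementation: each layer is modeled as a dict mapping a path to its content, and a value of `None` stands in for a deletion marker in that layer. The names `squashed` and `all_layers` mirror the scope names but are otherwise hypothetical helpers.

```python
DELETED = None  # stand-in for a deletion marker recorded in a layer


def squashed(layers):
    """Paths visible in the final filesystem after applying every layer in order."""
    fs = {}
    for layer in layers:
        for path, content in layer.items():
            if content is DELETED:
                fs.pop(path, None)  # a later layer removed the file
            else:
                fs[path] = content
    return set(fs)


def all_layers(layers):
    """Paths that exist in at least one layer, even if removed later."""
    return {
        path
        for layer in layers
        for path, content in layer.items()
        if content is not DELETED
    }


layers = [
    {"/usr/bin/curl": "binary", "/etc/os-release": "text"},  # base layer
    {"/usr/bin/curl": DELETED},                              # curl removed
    {"/app/server": "binary"},                               # application added
]
```

With these layers, `/usr/bin/curl` is reported by `all_layers(layers)` but not by `squashed(layers)`, because it was removed before the final filesystem.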

@rchincha
Author

rchincha commented Oct 6, 2022

@kzantow thanks for the clarification, it is the deletions that make the two options different.

About user-layers, is there an ETA we can expect? We don't mind pitching in if it helps expedite things.

@kzantow
Contributor

kzantow commented Oct 6, 2022

@rchincha we do not currently have an ETA for this, but of course PRs are welcome! FYI - I believe this change would probably need to be done predominantly in the stereoscope library, which Syft relies on for processing images.

@rchincha
Author

@kzantow after thinking about this some more, also wondering if an --offline option is feasible.

Most package managers, given a package name/version, can also list files included in the package and files to be installed.

$ dpkg -l curl
ii  curl           7.81.0-1ubuntu1.4 amd64        command line tool for transferring data with URL syntax

$ dpkg-query -L curl
/.
/usr
/usr/bin
/usr/bin/curl
...
etc

So the question is: can one simply pass the package name/version and its constituent list of files and generate an SPDX document? This would of course be orthogonal to grokking container images.
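To make the idea concrete, here is a minimal sketch that turns the two dpkg outputs above into a partial, unvalidated SPDX-style skeleton. It is not a syft feature and not a complete SPDX document: a valid document also needs fields such as a document namespace, creation info and per-file checksums, which cannot be derived from names alone. The helper names are hypothetical, and the directory filter is a heuristic (an empty directory would be mistaken for a file).

```python
import re


def parse_status_line(line):
    """Split one `dpkg -l` row into its columns."""
    _status, name, version, arch, description = line.split(None, 4)
    return {"name": name, "version": version, "arch": arch,
            "description": description}


def leaf_paths(listing):
    """Keep paths from `dpkg-query -L` output that are not a parent of another path."""
    paths = [p.strip() for p in listing.splitlines()]
    paths = [p for p in paths if p and p != "/."]
    return [p for p in paths
            if not any(q.startswith(p.rstrip("/") + "/") for q in paths)]


def spdx_skeleton(status_line, listing):
    """Build an incomplete SPDX-style dict for one package and its files."""
    pkg = parse_status_line(status_line)
    # SPDX identifiers may only contain letters, digits, "." and "-".
    pkg_id = "SPDXRef-Package-" + re.sub(r"[^A-Za-z0-9.-]", "-", pkg["name"])
    files = [{"SPDXID": f"SPDXRef-File-{i}", "fileName": path}
             for i, path in enumerate(leaf_paths(listing))]
    return {
        "spdxVersion": "SPDX-2.3",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": f"{pkg['name']}-{pkg['version']}",
        "packages": [{
            "SPDXID": pkg_id,
            "name": pkg["name"],
            "versionInfo": pkg["version"],
            "downloadLocation": "NOASSERTION",
        }],
        "files": files,  # checksums intentionally missing; see note above
    }
```

Feeding it the `dpkg -l curl` row and the `dpkg-query -L curl` listing yields one package entry for curl and one file entry per non-directory path.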

@kzantow
Contributor

kzantow commented Oct 11, 2022

@rchincha I don't quite follow --offline in this context, but if you're thinking about providing Syft with a list of packages and/or files, this might be a feasible way to do things. We are working on having a way to catalog SBOMs we find on the file system, and we could potentially add a "simple" SBOM format that's like a CSV or text file.
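No such format has been defined, so the following is purely a hypothetical sketch of what reading a "simple" CSV SBOM could look like. It assumes one row per file with the columns name, version, path, and treats lines starting with `#` as comments. The function name `read_simple_sbom` and the column layout are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def read_simple_sbom(text):
    """Parse rows of `name,version,path` into {(name, version): [paths]}."""
    packages = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        name, version, path = (field.strip() for field in row)
        packages[(name, version)].append(path)
    return dict(packages)


example = """\
# name,version,path
curl,7.81.0-1ubuntu1.4,/usr/bin/curl
curl,7.81.0-1ubuntu1.4,/usr/share/man/man1/curl.1.gz
"""
```

Calling `read_simple_sbom(example)` groups both paths under the single key `("curl", "7.81.0-1ubuntu1.4")`.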

@rchincha
Author

"but if you're thinking about providing Syft with a list of packages and/or files, this might be feasible way to do things."
exactly this ^

4 participants