Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(import): import single-layer container images directly #548

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rchincha
Copy link
Contributor

What type of PR is this?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs showing the issue:

Testing done on this change:

Automation added to e2e:

Will this break upgrades or downgrades?

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rchincha rchincha changed the title feat(import): import container images directly feat(import): import single-layer container images directly Nov 13, 2023
@rchincha rchincha force-pushed the imp branch 2 times, most recently from fae4444 to 8a9f0a9 Compare November 13, 2023 23:23
Copy link

codecov bot commented Nov 13, 2023

Codecov Report

Attention: 50 lines in your changes are missing coverage. Please review.

Comparison is base (a576aa3) 56.36% compared to head (e7234cc) 57.05%.

Files Patch % Lines
pkg/oci/oci.go 69.51% 17 Missing and 8 partials ⚠️
pkg/stacker/import.go 42.30% 10 Missing and 5 partials ⚠️
pkg/overlay/pack.go 47.36% 6 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #548      +/-   ##
==========================================
+ Coverage   56.36%   57.05%   +0.69%     
==========================================
  Files          64       64              
  Lines        7505     7545      +40     
==========================================
+ Hits         4230     4305      +75     
+ Misses       2553     2493      -60     
- Partials      722      747      +25     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a similar feature to what i was asking for at #448 .

I don't think I like the use of 'import' for it though. I'm not sure exactly why, but it seems a bit unnatural.
Some questions this brings up to me:

  • Does every 'import' that specifies a 'dest' get its own layer? ie:
    import:
      - path: file1.txt
        dest: /etc/file1.txt
      - path: file2.txt
        dest: /etc/file2.txt
    • Do those each get their own layer? I can see benefits to each approach.
  • Why would this feature be limited to "single layer" container images? Why shouldn't it "import and squish" or "import and not squish"?
  • I think this is just an optimization (admittedly useful) of function that is already there. At least for the clean homebrew layout (where each layer contains its own paths without overlap) you could just create a 'ca-certificates' layer that was 'build_only: true' with 'from: docker / url: docker://ghcr.io/homebrew/core/ca-certificates:2022-10-11', and then 'import' the 'stacker://ca-certificates/ca-certificates' layer.

In related comment, i think import needs a fair amount of work (#456, #530, #516, #508).

@rchincha
Copy link
Contributor Author

This is a similar feature to what i was asking for at #448 .

I don't think I like the use of 'import' for it though. I'm not sure exactly why, but it seems a bit unnatural.

I would argue that we already import file paths and stacker:// and the only thing missing is docker://

Some questions this brings up to me:

* Does every 'import' that specifies a 'dest' get its own layer? ie:

Yup.

* Why would this feature be limited to "single layer" container images?  Why shouldn't it "import and squish" or "import and _not_ squish"?

Intention is to initially keep this simple - 1:1 mapping between a layer and pkgname.

* I think this is just an optimization (admittedly useful) of function that is already there.  At least for the clean homebrew layout (where each layer contains its own paths without overlap) you could just create a 'ca-certificates' layer that was 'build_only: true' with 'from: docker / url: docker://ghcr.io/homebrew/core/ca-certificates:2022-10-11', and then 'import' the 'stacker://ca-certificates/ca-certificates' layer.

True, but this is a general case where all such layers are not built in the same place and/or made available in the same registry.

In related comment, i think import needs a fair amount of work (#456, #530, #516, #508).

Looks unrelated at least from code changes pov.

@rchincha
Copy link
Contributor Author

btw, there is definitely interest in the community wrt exactly this feature.

@rchincha
Copy link
Contributor Author

rchincha commented Nov 16, 2023

Why would this feature be limited to "single layer" container images? Why shouldn't it "import and squish" or "import and not squish"?

This is for driving a specific use case of pkgs distributed as layers. Don't expect this to be a main use case immediately.
Also, we don't want to be changing any layers we import. That is a big no-no.

$ crane rebase my-app:latest
--old_base=ubuntu@sha256:deadbeef...
--new_base=ubuntu:latest
--tag=my-app:rebased

https://github.com/google/go-containerregistry/blob/main/cmd/crane/rebase.md

Add support to import and overlay single-layer container images. This
allows for a different model of software distribution in the container
image ecosystem.

---

apt-get install openssl curl ca-certificates

(becomes)

    import:
    - path: docker://ghcr.io/homebrew/core/openssl/1.1:1.1.1k
      dest: /
    - path: docker://ghcr.io/homebrew/core/curl:8.0.1
      dest: /
    - path: docker://ghcr.io/homebrew/core/ca-certificates:2022-10-11
      dest: /

Signed-off-by: Ramkumar Chinchani <[email protected]>
Copy link
Contributor

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

Maybe I'm missing something, if so, please tell me what. But the given
example in test/ doesn't work in any way that I can see. curl:8.0.1 provides /opt/curl/8.0.1/bin/curl and your test verifies that the file is present in the compose-img. However, it does not verify that the executable works. It does not work. The curl binary there depends on 33 libraries. Only one of the libraries (libcurl.so.4) is provided by the layer.

At the conceptual level it has problems that I'm not sure how to solve.

The example given here for "single layer images" mentions packages. But in normal distro (and I think with homebrew also) this just isn't realistic as it completely ignores dependencies. You can't use the layer you imported without knowledge of the dependencies. Duplicating the dependency knowledge in a stacker file would be brittle and require lots of maintenance in any real case.

    - path: docker://ghcr.io/homebrew/core/openssl/1.1:1.1.1k
      dest: /
    - path: docker://ghcr.io/homebrew/core/curl:8.0.1
      dest: /opt/
    - path: docker://ghcr.io/homebrew/core/ca-certificates:2022-10-11
      dest: /

This above is taken from the 'compose-img' test that you provided.

I actually really think its valuable to allow "composing" layers of only other layers (see #448). I do not believe that limiting such function to "single layer images" is realistic or helpful.

Doing this right, and supporting the "packages as layers" use case that you
provided is hard. I'm not even sure if its plausible.
Consider a simple goal of composing a image that had "curl" and "bash". Given your syntax, that might look like this:

imports:
  - path: docker://registry/packages/bash
    dest: /
  - path: docker://registry/packages/curl
    dest: /

Bash has a dependency on a libc. curl has a dependency on a libc. We assume each of these images have their dependencies somehow accounted for. But what happens if bash's libc dependency is incompatible with curl's libc dep? bash will just be broken as curl replaced it. The situtation does not get any better with "single layer" images, as presumably each of the layers provides /lib/x86_64-linux-gnu/libc.so.6 and we've still merged 2 namespaces into 1 namespace without any knowlege of how to do that. It just wont work.

I'm not saying that this is impossible to solve. nixos does a lot of this... packages are fully independent of other packages, but I don't think naive combining of package trees would "just work" without duplication or something to "finalize".

- path: docker://ghcr.io/homebrew/core/openssl/1.1:1.1.1k
dest: /
- path: docker://ghcr.io/homebrew/core/curl:8.0.1
dest: /opt/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this work?
You take a layer that is rooted at "/" and place it at "/opt" ? Won't that have to create a new layer (sha256 blob tarball) , thus reducing the benefit and destroying any signatures on the imported layer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you import, you verify the signatures, and when this is done, you would have to re-sign the whole image again

@rchincha
Copy link
Contributor Author

Maybe I'm missing something, if so, please tell me what. But the given
example in test/ doesn't work in any way that I can see. curl:8.0.1 provides /opt/curl/8.0.1/bin/curl and your test verifies that the file is present in the compose-img. However, it does not verify that the executable works. It does not work. The curl binary there depends on 33 libraries. Only one of the libraries (libcurl.so.4) is provided by the laye

homebrew is a pkg mgmt that distributes content via OCI layers, but apart from that it does more. What we will go for is a fully functional OCI layout that doesn't need any additional post-install step. How do we get there if we don't build this in, in the first place.

@smoser
Copy link
Contributor

smoser commented Dec 1, 2023

homebrew is a pkg mgmt that distributes content via OCI layers, but apart from that it does more.

How is one supposed to use that? As I showed, the layers are not usable. They are missing dependencies. What is supposed to resolve those dependencies? How is one supposed to use the curl oci layer? Is it just left as an exercise to the user to find the correct layers to fulfill the 32 different missing layers?

@rchincha
Copy link
Contributor Author

rchincha commented Dec 1, 2023

Is it just left as an exercise to the user to find the correct layers to fulfill the 32 different missing layers?

One has to build a pkg mgmt system around this just as homebrew has done.
https://www.ypsidanger.com/homebrew-is-great-on-linux/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants