qemu-multiarch support #592
@rodnymolina in case he has further info on where we are with running multiarch builds with Sysbox.
To move forward with this, we first need to allow containerized processes to write into the binfmt_misc filesystem. Unfortunately, this isn't a trivial task. Notice that the solutions that currently exist to address this issue at the host level are not applicable to our case, since they require the execution of privileged containers.
I've found I can do multi-arch buildx builds on a dockerd running inside an unprivileged sysbox-runc container, as long as the underlying OS where sysbox-runc is installed already has the qemu binaries registered under /proc/sys/fs/binfmt_misc. If so, is the scope of this issue limited to just modifying or registering new binfmt_misc entries? In that case, I'd suggest we update the limitations page to clarify this.
Interesting, could you elaborate on how you installed the qemu binaries there?
Sure. This is a bit long, so I'll collapse it!

(I found this post useful; I was kind of freestyling with ideas from it: https://medium.com/@artur.klauser/building-multi-architecture-docker-images-with-buildx-27d80f7e2408 .)

The top-level host OS is Ubuntu 20.04. I installed the qemu binaries, which left them registered:

```
$ ls /proc/sys/fs/binfmt_misc/
python3.8     qemu-alpha  qemu-armeb  qemu-hppa  qemu-microblaze  qemu-mips64    qemu-mipsel   qemu-mipsn32el  qemu-ppc64       qemu-ppc64le  qemu-riscv64  qemu-sh4    qemu-sparc        qemu-sparc64  qemu-xtensaeb  status
qemu-aarch64  qemu-arm    qemu-cris   qemu-m68k  qemu-mips        qemu-mips64el  qemu-mipsn32  qemu-ppc        qemu-ppc64abi32  qemu-riscv32  qemu-s390x    qemu-sh4eb  qemu-sparc32plus  qemu-xtensa   register
$ cat /proc/sys/fs/binfmt_misc/qemu-aarch64
enabled
interpreter /usr/bin/qemu-aarch64-static
flags: OCF
offset 0
magic 7f454c460201010000000000000000000200b700
mask ffffffffffffff00fffffffffffffffffeffffff
```

I've got sysbox-runc 0.5.2 installed too, and dockerd 20.10.14. I've run buildkitd under sysbox-runc successfully in two ways.

I'm setting up buildkitd to use with a GitLab CI runner. I'd initially tried and failed to do multi-arch builds when running dockerd inside the GitLab CI runner's container (which is how I found this issue). So after that I had been planning to set up a standalone buildkitd and share it with the runners as a remote builder; setting that up is how I noticed that it actually works under sysbox-runc (after installing the qemu binaries on the top-level host OS).
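For background (my own sketch, not from the thread): entries like the qemu-aarch64 one above are created by writing a single record to `/proc/sys/fs/binfmt_misc/register`, which is precisely the write that fails from inside a container. The record format is `:name:type:offset:magic:mask:interpreter:flags`; the magic/mask values below are taken from the listing above, and the helper function name is hypothetical.

```shell
#!/bin/sh
# Sketch of the binfmt_misc register record: :name:type:offset:magic:mask:interpreter:flags
# type M = match on magic bytes; flags OCF = open-binary, credentials, fix-binary.
# On a real host this record would be written (as root) to
# /proc/sys/fs/binfmt_misc/register; here we only assemble the string.
make_register_line() {
  name=$1; magic=$2; mask=$3; interp=$4; flags=$5
  printf ':%s:M::%s:%s:%s:%s' "$name" "$magic" "$mask" "$interp" "$flags"
}

# magic/mask are the qemu-aarch64 values from the listing above, with each
# byte written as a \xNN escape, which is what the register file expects.
line=$(make_register_line qemu-aarch64 \
  '\x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00' \
  '\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff' \
  /usr/bin/qemu-aarch64-static OCF)
printf '%s\n' "$line"
```

The F (fix-binary) flag makes the kernel open the interpreter once, at registration time, which is what lets containers that don't have the qemu binaries still run foreign-arch executables.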
So this container seems to inherit the same binfmt registrations as the host:

```
$ docker container exec -it remote-buildkitd sh
/ # ls /proc/sys/fs/binfmt_misc/
python3.8     qemu-arm    qemu-hppa        qemu-mips      qemu-mipsel     qemu-ppc         qemu-ppc64le  qemu-s390x  qemu-sparc        qemu-xtensa    status
qemu-aarch64  qemu-armeb  qemu-m68k        qemu-mips64    qemu-mipsn32    qemu-ppc64       qemu-riscv32  qemu-sh4    qemu-sparc32plus  qemu-xtensaeb
qemu-alpha    qemu-cris   qemu-microblaze  qemu-mips64el  qemu-mipsn32el  qemu-ppc64abi32  qemu-riscv64  qemu-sh4eb  qemu-sparc64      register
/ # cat /proc/sys/fs/binfmt_misc/qemu-aarch64
enabled
interpreter /usr/bin/qemu-aarch64-static
flags: OCF
offset 0
magic 7f454c460201010000000000000000000200b700
mask ffffffffffffff00fffffffffffffffffeffffff
```

The actual qemu binaries don't exist in the container, though.
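An aside to illustrate what those magic/mask lines mean (my own sketch, not from the thread): the kernel ANDs each of the first 20 bytes of an executed file with the mask and compares the result against the magic; for qemu-aarch64 this matches an ELF header for a 64-bit little-endian aarch64 binary (e_machine 0xb7). A rough shell reproduction of that check, using the values printed above:

```shell
#!/bin/sh
# Reproduce binfmt_misc's magic/mask match for the qemu-aarch64 entry above.
# The kernel checks (file_byte & mask_byte) == magic_byte for each byte.
MAGIC=7f454c460201010000000000000000000200b700
MASK=ffffffffffffff00fffffffffffffffffeffffff

matches_aarch64() {
  # hex of the file's first 20 bytes, with spaces/newlines stripped
  hex=$(od -An -tx1 -N20 "$1" | tr -d ' \n')
  i=0
  while [ "$i" -lt 40 ]; do
    b=0x$(printf '%s' "$hex"   | cut -c$((i + 1))-$((i + 2)))
    m=0x$(printf '%s' "$MASK"  | cut -c$((i + 1))-$((i + 2)))
    g=0x$(printf '%s' "$MAGIC" | cut -c$((i + 1))-$((i + 2)))
    [ $((b & m)) -eq $((g)) ] || return 1
    i=$((i + 2))
  done
}

# Build a fake 20-byte arm64 ELF header (\x7f ELF, 64-bit LE, e_machine 0xb7).
printf '\177ELF\002\001\001\000\000\000\000\000\000\000\000\000\002\000\267\000' \
  > /tmp/fake-arm64
matches_aarch64 /tmp/fake-arm64 && echo match   # prints: match
```

The observation above that the binaries don't exist in the container is consistent with the F (fix-binary) flag in the entry: the kernel opened /usr/bin/qemu-aarch64-static on the host at registration time, so the container never needs its own copy.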
Then I can run another container with the docker CLI in that network and connect it to this buildkitd as a remote builder.
And I just made a simple Dockerfile to do a test build; I think it was just:

```dockerfile
FROM alpine
RUN echo "Hi from $(uname -a)"
```

And this builds successfully for arm64 and amd64.
So that's fine. But with the qemu binaries installed on the host, I'm able to just start a buildkit builder on a dockerd running inside a CI build container. The CI runner uses the Docker executor. My config for the Docker executor is this:

```toml
# ...
[runners.docker]
  tls_verify = true
  image = "ubuntu:20.04"
  privileged = false
  disable_entrypoint_overwrite = false
  oom_kill_disable = false
  disable_cache = false
  volumes = ["/certs/client", "/cache", "/var/lib/docker"]
  shm_size = 0
  runtime = "sysbox-runc"
  extra_hosts = ["gitlab-ci-minio:host-gateway"]
```

(I guess I probably shouldn't share …) In the CI job I run:

```
$ docker buildx create --use
$ docker buildx bake --push
```

So buildkitd is running under the dockerd running in the CI job's container. And this builds a real image with plenty of stuff in, not just the toy Dockerfile above.
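For anyone wiring this into GitLab CI, here's a hypothetical `.gitlab-ci.yml` job matching the runner config above (the job name and image are my assumptions, not from the thread). Because the executor is pointed at the sysbox-runc runtime, dockerd can run inside the job container without privileged mode:

```yaml
# Hypothetical sketch only; assumes dockerd is reachable inside the job
# container as in the setup described above.
multiarch-build:
  image: ubuntu:20.04
  script:
    - docker buildx create --use
    - docker buildx bake --push
```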
Thank you @h4l! With some minor modifications we got it working in our Buildkite environment as well! All our modifications were tweaks around using an existing per-job network, as well as loading up the buildx configuration in every build's environment hooks :)
@mhornbacher Nice job, that's great, glad you got it working!
@rodnymolina @ctalledo I think this is a fairly workable solution for most customers. Big shoutout to @h4l for his discoveries!

For anyone stumbling upon this via a search: implementation details are in the above two comments 👍 Feel free to @ me with a comment here if you need more explanation.
Thanks @mhornbacher for your contribution to resolving this issue, much appreciated!
Just a heads-up that the workaround mentioned above to allow multi-arch builds within a sysbox container is not working as of the v0.6.2 release (thanks to @DekusDenial for letting us know). This is a consequence of a recent change to ensure that Sysbox exposes all the procfs and sysfs nodes known within the container's namespaces. As a side-effect of this, we stopped exposing a few host nodes within sysbox containers, which on one hand offers greater security, but on the other, breaks functionality like the one required by the above workaround. As a fix, we could make an exception for these nodes by creating a new 'handler' in sysbox-fs to expose the host's binfmt_misc.
Thanks. Pinging @jmandel1027 :)
I have the same problem running CI/CD; any workaround? Thanks.
You can use version 0.6.1 with the workarounds detailed above if you need multi-arch builds for now. This is no longer my day-to-day, so I am not on top of any 0.6.2 workarounds yet.
I am using Gitea with a self-hosted act_runner; it seems not to use moby/buildkit at all.
Hi @DekusDenial,
I am not sure; I see a few related commits, but nothing that specifically addresses that issue. How can I reproduce the problem?
I enabled qemu on a host and ran a sysbox container, but /proc/sys/fs/binfmt_misc was empty and didn't inherit from the host.
I see; no, unfortunately the ability to expose (i.e., namespace) binfmt_misc inside the Sysbox container is not present in this release. And exposing the host's binfmt_misc inside the container (as Sysbox used to do by mistake) is not a good solution, because it's a global resource in the system (i.e., if a container modifies the binary associated with a file type, all other containers are affected by the change). Ideally binfmt_misc needs to be per-Sysbox-container. That's not a simple task because it requires Sysbox to emulate binfmt_misc inside the container. It can be done, but it's a difficult task, and we've not yet had the chance to work on it unfortunately.
That's the reason we still pin to 0.6.1, given that we have no other means to enable qemu for multi-arch workloads.
Does 0.6.1 support it out of the box?
Got it; would the work-around in the comment above help?
Unfortunately no, because our setup and use case are not the same as above; and based on the reporter's comment above, this issue also prevented them from moving to 0.6.2, so the workaround may no longer work.
I see; not sure how to help then: we can't go back to the v0.6.1 behavior because it breaks container isolation (i.e., it allows a container to modify a globally shared system resource, the binfmt_misc subsystem). But on the other hand the proper fix is a heavy lift. The only thing I can think of is adding a config option in Sysbox that allows the container to access the host's binfmt_misc; the config would be set per-container, via an env variable (e.g., …).
I think most people would prefer this config as a workaround for qemu. FYI, I used to have workloads scheduled on kata-containers, where users can register qemu on-demand via privileged docker, but this won't work inside a sysbox container; that's why I have been relying on the host to pre-provide qemu. Meanwhile, if you can provide the sources where this config would be implemented, or in other words where this binfmt_misc resource would be excluded from isolation, people can patch it on their side.
It's a bit more complicated; the work would be in sysbox-fs (the component that emulates portions of procfs and sysfs inside the container). In addition, we would need to add the code that enables the feature on a per-container basis. That requires changes in sysbox-runc and in the transport that sends the config to sysbox-fs. It's not super difficult, but it's not a simple change either. For someone who knows the code, it's a few days of work (we also have to write the tests). If you wish to contribute (we appreciate that!), let me know and I can provide more orientation. Otherwise it will have to wait until we have the cycles, as we balance Sysbox development & maintenance with other work at Docker.
I really look forward to this feature being available soon.
@ctalledo saw your comment regarding …. I have to support a significant docker multi-arch workload in my workspace, so technically I am still on 0.6.1. And with the fix for runc 1.2.x and for #879 just around the corner, I am kinda stuck.
Multi-arch buildx builds currently do not work on the sysbox runtime due to lack of support for this feature.
As per this Slack message, it is actively being worked on, and this ticket is to track progress.