
Disable k3s embedded containerd process, start k3s-containerd binary as external process #3478

Conversation

andrewd-zededa
Contributor

The embedded k3s containerd instance does not allow changing the containerd root path away from a subdirectory of the k3s data dir: k3s specifies a root on the command line, overriding whatever is in the config.toml file. So we will disable the embedded k3s containerd process and instead use the k3s-supplied containerd binary to start a containerd instance next to k3s (still in the kube container), which will allow us to change the containerd root. This containerd instance will reuse the pillar /run/containerd-user/containerd.sock, making it easy to reuse the wait-ready code on the pillar side.

Add missing findutils package for longhorn.
Raise io priority for embedded etcd.
Add Pramodh's containerd.go change to skip starting containerd-user in pillar.
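The startup change described above might be sketched roughly like this, assuming k3s's `--container-runtime-endpoint` flag is used to disable the embedded containerd; the config path is illustrative, not the actual one from this PR:

```shell
# Sketch only: meant to run inside the kube container on an EVE device.

start_external_containerd() {
  # Start the k3s-shipped containerd binary ourselves, so root/state paths
  # come from our own config.toml instead of k3s command-line overrides.
  /var/lib/rancher/k3s/data/current/bin/containerd \
    --config /etc/containerd/config-kube.toml &   # illustrative config path
}

start_k3s() {
  # With --container-runtime-endpoint set, k3s skips launching its
  # embedded containerd and talks to the external socket instead.
  k3s server \
    --container-runtime-endpoint unix:///run/containerd-user/containerd.sock
}
```

Reusing the pillar socket path means the existing pillar wait-ready code keeps working unchanged.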

andrewd-zededa and others added 8 commits June 19, 2023 14:45
…still use the k3s supplied binary to allow us to change the containerd root. Add missing findutils package for longhorn. Raise io priority for embedded etcd. Add Pramodhs containerd.go change to skip starting containerd-user in pillar.

Signed-off-by: Andrew Durbin <[email protected]>
@deitch
Contributor

deitch commented Oct 3, 2023

I find it surprising that k3s doesn't let you select a different containerd or control the path, as that is the kind of thing they would, but maybe they never got to it.

The two questions I have are:

  • why do we care? It's all inside of the container or VM anyway.
  • what do you mean by reusing the pillar containerd socket? That would mean it's talking to the main containerd. Why would we want that?

@codecov

codecov bot commented Oct 3, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (9862c3b) 20.32% compared to head (e4ae2ce) 20.30%.
Report is 97 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3478      +/-   ##
==========================================
- Coverage   20.32%   20.30%   -0.02%     
==========================================
  Files         198      198              
  Lines       45230    45246      +16     
==========================================
- Hits         9192     9187       -5     
- Misses      35355    35374      +19     
- Partials      683      685       +2     
Files | Coverage Δ
pkg/pillar/containerd/containerd.go | 4.53% <0.00%> (-0.13%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.

@zedi-pramodh

@deitch

To answer your questions.

  1. We care because we want to download apps using the eve download mechanism, which is a known, secure way, and store blobs/images the way we do today. In other words, we do not depend on the Kubernetes image download mechanism.

  2. I think Andrew meant that we will be using the pillar user containerd, not the main containerd. So basically we will be using /run/containerd-user/containerd.sock, which is the socket access point for deployed images.

@deitch
Contributor

deitch commented Oct 4, 2023

We care because we want to download apps using eve download mechanism which is a known secure way and store blobs/images the way we do today. In other words we do not depend on Kubernetes image download mechanism

Thanks for the explanation; this definitely helps.

To the best of my knowledge, Kubernetes actually does not have a download mechanism. It depends entirely on the container runtime (in this case, containerd). Either way, pointing k3s at a different containerd doesn't change its download flow. If you ask it to run a pod, and the image is not there, it will use the containerd API (not precisely, but close enough) to download the image. If containerd has network access, it will do so.

This raises an interesting question, which @eriknordmark is likely to know as well. Right now, I do not think our containerd is network-blocked. We do not use it to download - instead we use the volumemgr+downloader+verifier and load it into containerd, and then use containerd to run it. But is containerd actually blocked from downloading? I don't think it is. Maybe we configure it to have no network access, but I am somewhat sure we do not have that.

So pointing at a different containerd does not change the flow. If k3s is asked for a container, it will ask containerd to get that image unless available. Mind, if ImagePullPolicy is set to always, then it always will ask it to pull it.

I think we have some experiments here to determine:

  • can user containerd pull images directly or is it blocked; ie is it not downloading because we do not use it, or because it cannot?
  • if it is not used, then what happens if a container has ImagePullPolicy=always? I am almost certain it will try to pull it

we will be using pillar user containerd not the main containerd

We currently have 2 containerd running: system (runs pillar, etc.) and user (runs user workloads, i.e. ECOs). I think you are saying, k3s will add another, which would make it 3. Change it so that it remains only 2, and k3s points at the user containerd managed by pillar?

If that is so, 100% in favour. I still am surprised you cannot just point k3s at an existing containerd, but you did the research, so I must put my surprise aside and accept it.

Summary:

  1. We definitely should use the user containerd, however we get that (this PR ✅ )
  2. We should check if the user containerd is blocked from downloading or just we never happen to use it
  3. If the latter, we need to find a way to ensure k3s doesn't trigger it

@zedi-pramodh

@deitch You brought up good points let me summarize it for you.

  1. Yes, k3s supports an external containerd, but k3s ships with a compatible containerd which they test extensively, so it is in our best interest to use the containerd shipped with k3s. Basically, on kubevirt eve we will not start the user containerd from eve; we will start the containerd shipped by k3s, but use the same socket endpoint (/run/containerd-user/containerd.sock) and repository location (/persist/vault/containerd) for code compatibility.

  2. Yes, containerd itself will not pull the images; that has to be triggered from Kubernetes. But our approach is that eve will always pull the image and make it available for k3s. And yes, we are setting the flag ImagePullPolicy=Never; if the image is not found, the app launch will fail. You will see that in a different PR. This project will have multiple PRs; we cannot put everything in one PR, so we are trying to divide them to be as independent as possible.
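The ImagePullPolicy approach could be sketched as a pod manifest like the following (the pod and image names are illustrative, not from this PR); with `imagePullPolicy: Never`, kubelet fails the launch rather than asking containerd to pull:

```shell
# Write an illustrative pod manifest. The image must already be present in
# containerd (loaded via the EVE downloader/verifier path) or launch fails.
cat > /tmp/eco-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: eco-app                           # illustrative name
spec:
  containers:
  - name: app
    image: docker.io/library/nginx:1.25   # must be pre-loaded, never pulled
    imagePullPolicy: Never
EOF
# kubectl apply -f /tmp/eco-pod.yaml
```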

@deitch
Contributor

deitch commented Oct 4, 2023

Yes k3s supports external containerd, but k3s ships with a compatible containerd which they test extensively. Hence it is in our best interest to use containerd shipped with k3s. So, basically on kubevirt eve we will not start user containerd from eve and start containerd shipped by k3s but use the same socket endpoint (/run/containerd-user/containerd.sock) and repository location (/persist/vault/containerd) for code compatibility.

That might be a pretty big change. Have we tested the shipped version with everything else that we do that depends on containerd? Version and functionality mismatches? CVEs opened or closed?

It might be fine, but we definitely need to consciously decide, "we are ok switching the specific version we are using".

Yes containerd itself will not pull the images, it has to be triggered from Kubernetes

You can get containerd to pull directly, whether using ctr or the gRPC API from somewhere other than Kubernetes, but we might not here.

But our approach is that eve will always pull the image and make it available for k3s

Yes, makes sense. That has been our philosophy all along.

And yes we are setting the flag ImagePullPolicy=Never

That probably should do it. I think we still should test if containerd has the ability to bypass our downloading. It is pretty simple:

  1. Launch an eve device
  2. Connect to user containerd via ctr
  3. Do an image pull
  4. See what happens
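On a device, steps 2–4 above might look like this with `ctr` (the image reference is arbitrary; if the pull fails or hangs, that would suggest the network path is blocked):

```shell
SOCK=/run/containerd-user/containerd.sock

pull_test() {
  # Attempt a direct pull through the user containerd, bypassing the
  # volumemgr/downloader/verifier path entirely.
  ctr --address "$SOCK" images pull docker.io/library/alpine:latest

  # List images afterwards to see whether the pull landed.
  ctr --address "$SOCK" images ls
}
# Run pull_test on the device and observe the result.
```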

@zedi-pramodh

Yes k3s supports external containerd, but k3s ships with a compatible containerd which they test extensively. Hence it is in our best interest to use containerd shipped with k3s. So, basically on kubevirt eve we will not start user containerd from eve and start containerd shipped by k3s but use the same socket endpoint (/run/containerd-user/containerd.sock) and repository location (/persist/vault/containerd) for code compatibility.

That might be a pretty big change. Have we tested the shipped version with everything else that we do that depends on containerd? Version and functionality mismatches? CVEs opened or closed?

It might be fine, but we definitely need to consciously decide, "we are ok switching the specific version we are using".

I am not sure I understood your question correctly. This is a new eve flavor and we will not be upgrading any existing kvm based installs. This is a new product altogether and we will be starting with containerd that comes with k3s.

Yes containerd itself will not pull the images, it has to be triggered from Kubernetes

You can get containerd to pull directly, whether using ctr or the gRPC API from somewhere other than Kubernetes, but we might not here.

But our approach is that eve will always pull the image and make it available for k3s

Yes, makes sense. That has been our philosophy all along.

And yes we are setting the flag ImagePullPolicy=Never

That probably should do it. I think we still should test if containerd has the ability to bypass our downloading. It is pretty simple:

  1. Launch an eve device
  2. Connect to user containerd via ctr
  3. Do an image pull
  4. See what happens

Yes containerd can pull images if we explicitly ask it to, but as a product we are not doing that and will not do that. No user is logging into the device to do that manually. So basically nothing changes from what we have now.

@deitch
Contributor

deitch commented Oct 5, 2023

This is a new eve flavor and we will not be upgrading any existing kvm based installs. This is a new product altogether and we will be starting with containerd that comes with k3s

I wasn't worried about upgrades. We still are releasing a new eve, which has lots of components that depend on that containerd. It likely will be just fine, but we should not change the version or location of a central component in eve without validating first that we are ok.

To put it in other terms: let's say that the current version is 1.7.0, and k3s comes with 1.7.6. We should first check that we are ok with 1.7.6 in place, and then switch. Conversely, if we are running 1.7.6, and k3s is running 1.6.5, then we very well might have compatibility issues.

Yes containerd can pull images if we explicitly ask it to, but as a product we are not doing that and will not do that. No user is logging into the device to do that manually. So basically nothing changes from what we have now

Which is good. We should, however, disable containerd from doing that entirely, as long as that is our philosophy, and not rely on, "we don't do that". This is not an issue for your deliverable here; you just happen to be lucky enough to have brought it to the fore by having another component that communicates with containerd.

@zedi-pramodh

I wasn't worried about upgrades. We still are releasing a new eve, which has lots of components that depend on that containerd. It likely will be just fine, but we should not change the version or location of a central component in eve without validating first that we are ok.

To put it in other terms: let's say that the current version is 1.7.0, and k3s comes with 1.7.6. We should first check that we are ok with 1.7.6 in place, and then switch. Conversely, if we are running 1.7.6, and k3s is running 1.6.5, then we very well might have compatibility issues.

I am sorry but this is what I am not getting. We do not run any different user containerd version in eve. We just run the version that comes with k3s. So the compatibility between k3s and user containerd is always maintained.

@deitch
Contributor

deitch commented Oct 5, 2023

Now I am confused. We already install a containerd in eve. Didn't we write above that we will replace that one with the one that comes with k3s?

@zedi-pramodh

Now I am confused. We already install a containerd in eve. Didn't we write above that we will replace that one with the one that comes with k3s?

We won't replace the one that comes with eve. We just do not start it; we start the one that comes with k3s instead.

@deitch
Contributor

deitch commented Oct 5, 2023

That is replacing it. It doesn't matter if we have it on disk or not, it matters if we use it.

Again, it could be fine, but now we are tying our eve containerd version to the k3s one. That is a change - completely independent of k3s usage - that requires testing.

BTW, what happens when we need to upgrade containerd to some version for a feature, if k3s isn't there yet?

@zedi-pramodh

That is replacing it. It doesn't matter if we have it on disk or not, it matters if we use it.

Again, it could be fine, but now we are tying our eve containerd version to the k3s one. That is a change - completely independent of k3s usage - that requires testing.

BTW, what happens when we need to upgrade containerd to some version for a feature, if k3s isn't there yet?

This change is only in kubevirt eve and will not affect kvm-based eve. So we will always be using the containerd that comes with k3s, on kubevirt eve only. This does not affect our existing kvm eve or its upgrades.

Are you concerned about eve containerd client code working with the containerd from k3s? Yes, that will be tested; so far we have not seen any issues in our testing.

@deitch
Contributor

deitch commented Oct 6, 2023

Yes, I am concerned about that. I also am concerned about having variants on eve that change fundamental compatibility components.

@zedi-pramodh

Yes, I am concerned about that. I also am concerned about having variants on eve that change fundamental compatibility components.

I will be happy to get on a call and learn about your concerns. From my point of view, this commit and the upcoming commits are a no-op on existing eve, i.e., they do not change anything. The only changes are in the new kubevirt eve.

@andrewd-zededa
Contributor Author

@deitch

To address the question of containerd versions, here is each; the k3s one does appear to be at least one minor version behind.

k3s version:

8c857dec-7cb8-45cb-8e20-a6c7569446c4:/# /var/lib/rancher/k3s/data/current/bin/containerd -v
containerd github.com/k3s-io/containerd v1.6.19-k3s1

existing pillar version:

(ns: pillar) 8c857dec-7cb8-45cb-8e20-a6c7569446c4:/# /usr/bin/containerd -v
containerd github.com/containerd/containerd v1.7.2 0cae528dd6cb557f7201036e9f43420650207b58

@andrewd-zededa
Contributor Author

@deitch @zedi-pramodh

We're currently on k3s version v1.26.3+k3s1. If we upgrade to the latest k3s v1.28.2+k3s1 then we'll get a containerd upgrade to v1.7.6-k3s1.

@deitch
Contributor

deitch commented Oct 8, 2023

We are opening a Pandora's box (which, ironically, probably wasn't a box, as they didn't really have "boxes" back then 😄 ), in having different versions of containerd on different versions of eve which are almost the same but not quite. That is my concern. We should have eve, with its tested and supported version of its components, and things that plug into it and are tested with it.

I will be happy to get on call and learn about your concern

Sure, let's coordinate offline.

@andrewd-zededa
Contributor Author

andrewd-zededa commented Oct 23, 2023

Re-opening at @zedi-pramodh's request. Were there any requested changes to this PR from the offline conversation, @deitch and @zedi-pramodh?

@zedi-pramodh

Not that I know of @andrewd-zededa. @deitch reviewed the design document and in general agreed with this approach. But I will leave it to him to review and approve.

@@ -83,6 +120,14 @@ setup_prereqs

date >> $INSTALL_LOG
HOSTNAME=$(/bin/hostname)
logmsg "Starting wait for hostname, currently: $HOSTNAME"
while [[ $HOSTNAME = linuxkit* ]];
Contributor

What are you trying to check?
It is the case that EVE comes up with a linuxkit* name early in boot, but then the hostname is changed to be the device UUID string. So I wouldn't depend on it being linuxkit* if you are trying to check whether this is running EVE-OS.

Contributor

If you want to check whether or not the device has been onboarded then we have a way to do that using pubsub and you can potentially use the pubsub checkpoint file in /persist/status/zedclient/OnboardingStatus/global.json to check this.
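A minimal sketch of that check, assuming the checkpoint file simply appears (and is non-empty) once onboarding completes; the fields inside global.json are not inspected here, so this only tests presence:

```shell
ONBOARD_STATUS=/persist/status/zedclient/OnboardingStatus/global.json

# Returns success once the pubsub checkpoint file exists and is non-empty.
device_onboarded() {
  local f="${1:-$ONBOARD_STATUS}"
  [ -s "$f" ]
}

if device_onboarded; then
  echo "device onboarded"
fi
```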

Contributor Author

I could have added a good comment in the script there. This was added to avoid a hang further down in startup, where the k3s node-ready poll looked for a node in the ready state with the linuxkit* name (which no longer exists at that point in runtime). The k3s check-ready path has been refactored since the hostname poll change was put in place. With fresh eyes now, I think a smaller change could go in check_node_ready() and just read a fresh copy of the system hostname.
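The smaller change hinted at could look roughly like this (the function and helper names are hypothetical; the real script's check_node_ready differs):

```shell
# Detect the transient early-boot linuxkit-* hostname.
is_transient_hostname() {
  case "$1" in
    linuxkit*) return 0 ;;
    *)         return 1 ;;
  esac
}

check_node_ready() {
  # Re-read the hostname on every poll instead of caching it at startup,
  # since EVE renames the host to the device UUID string after boot.
  local node_name
  node_name=$(/bin/hostname)
  if is_transient_hostname "$node_name"; then
    return 1   # still early in boot; not ready yet
  fi
  kubectl get node "$node_name" --no-headers 2>/dev/null | grep -qw Ready
}
```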

Contributor

@eriknordmark eriknordmark left a comment

I see several merge commits, and those need to be removed since they cannot be pulled into master.

FWIW I use git fetch upstream master; git rebase -i upstream/master
to avoid ever getting any merge commits.

Also, there are a number of yetus issues to address in the script(s).

@eriknordmark eriknordmark marked this pull request as draft October 30, 2023 19:17
@eriknordmark
Contributor

Converted to draft since with the merge commits this is clearly not ready for review.

@andrewd-zededa
Contributor Author

Sorry, I'm just closing this to reopen with a cleaner PR; the rebase isn't happening cleanly.
