-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFE] Adding CI checks for toolbox in podman #10296
Comments
We do have a number of CI checks added by the Toolbox team previously, but I don't believe any of them use Toolbox itself, but instead test the ways Toolbox uses Podman. We are definitely not opposed to adding to this suite. I think @baude was opposed to having Toolbox tests gate Podman changes, but I could be misremembering here. |
Yes, we would be fine with gating our release, but would be reluctant on every PR, unless they are specific to Podman, like we currently have. This issue would be making sure toolbox breakage did not block Podman PRs being merged, which we have seen in CI systems in the past. |
Agree that doing this for each PR would probably be too much & blocking on toolbox bugs for podman. Once just before release to catch latent bugs would be great. |
I just want to reply for the record given @mheon comment. This has been an ongoing discussion with the TB team. We actually were not opposed to running their tests in our CI for every PR, but in the same breath, we also were not going to have the tests be blockers. I think most podman contributors who saw a busted test would give it a look-see and see about fixing it, but it wouldn't be required. I think that offer is still quite reasonable. The TB team was supposed to provide us with their tests so that we could create a CI instance and run it. That offer is still open. Ideally they would integrate it themselves with a PR but we could also assist happily. One last item we also agreed on as well. If we cannot figure out why a test is failing AND it lingers due to something like unavailibility, we would temporarily take the tests offline because we pay for our CI resources. Bottom line for me ... lets get the tests running in our CI. If someone outside our team can do the work, then it can be done as soon as the tests are ready. If not, I'll prioritize amongst all the things we have on our plate (as in, it will get done as we catch up). Thanks for taking this up! |
Let me translate how 'we don't want this to be a blocker' reads from the other side: We like to continue breaking your stuff. That is very disappointing. |
I agree with @matthiasclasen. I've been using Silverblue for a while now (almost 2 years) and it's generally fine, except for numerous Toolbox issues. Almost all of the Toolbox issues have come from Podman breaking things in version bumps. Like most others using Silverblue, having Toolbox and/or Podman not working prevents me from actually doing work until I fix it, as everything I work on is inside of a Toolbox container for development. (Generally, the fix is to roll back to the previous deployment of Silverblue, apply a hotfix from bodhi as an override replace, and/or nuke all my containers and rebuild my dev environment — which I thankfully have scripted, but others don't.) Some of these have happened in beta, but a worrying amount of them have hit released versions. There definitely needs to be some CI somewhere that would prevent a broken podman+toolbox combo from making its way into Silverblue. (Ideally, it would not make it into normal Fedora Workstation/Server either, as people do rely on toolbox and podman working there as well, but it's usually not quite as critical as it is in Silverblue.) It makes sense to try to catch these issues as early as possible, so having a "does it break toolbox?" test would be beneficial. (We've also hit a handful of Podman stability issues in Cockpit-Podman as well, but that's a different, but somewhat related topic.) |
The consensus was:
I think an important aspect here is that the latest regression involving the (There are tests in Podman that test upgrades, and in fact there was a test that could have caught it, but it was disabled because it wasn't obvious how to get it to work. Looks like that's getting fixed now.) In Toolbox, so far, we have relied upon manual code review and testing to protect backwards compatibility. I think that we should devote a bit more attention to have better automated tests for upgrades in the future. |
Normally we work with the Toolbox team to ensure that each Podman release
is tested and working before it reaches stable. This release was an early
RC that slipped in through a series of errors. The normal process should
have caught this one.
This doesn’t mean we shouldn’t improve gating and upstream CI, but this RC
slipping out was an anomaly and I don’t expect it to happen again - the
process will improve to prevent it.
…On Wed, May 12, 2021 at 10:46 Debarshi Ray ***@***.***> wrote:
The consensus was:
1. We will add unit tests to Podman's own upstream test suite that
mimics the ways in which Toolbox uses Podman. These would get run on each
pull request, be blocking, etc.. We have done that over time, and we need
to keep adding more as the code evolves and bugs are found, etc..
2. Get Toolbox's upstream test suite run as part of Podman's Fedora
gating. This is the *get Toolbox tests added to Podman* part.
@edsantiago <https://github.com/edsantiago> was helping @HarryMichal
<https://github.com/HarryMichal> with this, but I think it got stuck
somewhere in the complexities of Fedora gating.
I think an important aspect here is that the latest regression
<#10274> involving the PidFile
is about backwards compatibility with older Podman releases. This is
generally a more difficult thing to test against, and something that
affects Toolbox a lot because of the containers being treated as *pets*.
(There are tests in Podman that test upgrades, and in fact there was a
test that could have caught it, but it was disabled because it wasn't
obvious how to get it to work. Looks
<#10288> like that's getting
fixed now.)
In Toolbox, so far, we have relied upon manual code review and testing to
protect backwards compatibility. I think that we should devote a bit more
attention to have better automated tests for upgrades in the future.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10296 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB3AOCF4EB3OWTGE5FAKSFTTNKIDBANCNFSM44RT6J4Q>
.
|
@debarshiray you are exactly correct (and I stand corrected), it was gating tests that we targeted. And shy of the anomaly as @mheon describes, that shouild sufficiently satisfy both parties. Thank you for the clarification. And! The gating IS blocking, so that should satisfy folks here. |
I've dropped the ball on this too. I was waiting for my requirements. My requirements are:
As of this writing, # bats /usr/share/toolbox/test/system/
bats: libs/bats-support/load does not exist [message repeated a total of 7 times]
✗
(from function `source' in file /usr/share/toolbox/test/system/001-version.bats, line 3,
from function `bats_run_setup_file' in file /usr/libexec/bats-core/bats-exec-file, line 87,
from function `main' in test file /usr/libexec/bats-core/bats-exec-file, line 258)
`bats_run_setup_file' failed
... This is beyond my ability to fix. Once it is fixed, please let me know and I will work to hook toolbox-tests into podman fedora gating. Thanks! |
I just landed a fix for this. I'll discuss with @debarshiray making a new release with the fix included. Still, I know the tests will not work locally, since in the tests we rely on some functionality that is backed by the exact testing environment (Zuul). I'll have to amend that to make the tests executable on other systems. |
The tests should be fairly easy to run now since I merged containers/toolbox#774. I used the PR extensively on a FCOS host. So, after this I'll just update the test README and the |
A friendly reminder that this issue had no activity for 30 days. |
@debarshiray @garrett What is going on with this? |
Toolbox has a new release (v0.0.99.2) that fixes the tool-related problems @edsantiago mentioned. The tests themselves are completely functional but still have a few downsides:
The rpm is not ready due to missing |
That's sort of a big hassle. I don't think podman gating tests should be in the business of managing and installing non-rpm content. At this point I consider this a showstopper. I just installed |
Agreed.
Right. This is related to the fact that the two libraries are not packaged, yet. I remember you trying to run the tests some time ago and bumping into an issue with your current folder not being inside of the container causing the command to error out. That should be fixed now. |
@HarryMichal it looks like bats-support is licensed as Creative Commons, could you just include the necessary files in your |
Aaah, that should work. I'll do it that way then. |
@edsantiago We've got a build in Fedora Rawhide that should contain all the pieces to make the system tests runnable. The build is already in stable so either wait for it to become available to you or get it from koji. The build is based on a snapshot of main, so any changes can be easily made. |
I'm not seeing that. I see .0.99.2-1.fc35 (from June) in stable. I downloaded the koji build you cited, but can't install:
|
By stable I meant available via |
No, I was on Rawhide. I'm pretty sure it's not in stable nor updates-testing. |
Well, the build is still very fresh. I'm merely reporting what Bodhi says. Curious about the glibc requirement, though. Will test it out. |
@edsantiago We have fixed the prroblems in the test suite and I've sent a new build to rawhide. https://koji.fedoraproject.org/koji/taskinfo?taskID=73573768 I suppose it will be available in the next compose. Could you please check it works for you? |
tests pass as user |
I have the gating-test fixes necessary to run toolbox-tests as part of podman gating tests in rawhide. They pass using @containers/podman-maintainers where do we want to go with this? Are we really OK with running these tests as part of our rawhide gating? |
the diffs. I am not pushing to fedpkg, nor even submitting as a PR, because (1) the |
@olivergs @HarryMichal I believe we have buy-in to enable this in podman. I just did a final review, and hit a showstopper: as best I can see, you do not have a |
Hi @edsantiago. Ok. I'll work on that during the week and tell you when ready. Thanks! |
Hi @olivergs, how goes your effort to set up toolbox gating tests? Also, are you replicating these tests on f33 and f34? We have a podman release coming soon, and would really welcome feedback from the toolbox team. It's probably too late to integrate gating tests for this release, but if you have a way of running your tests on our bodhi (f33, f34) we would be most grateful. |
Hi Ed. As we talked about this yesterday, I'm giving the finishing touches to this. The tests are running but them fail. I'm currently trying to get the artifacts for the test to know what is failing. I will give more feedback when this is finished. |
Hi again. The gating in fedora is already working for fedora 35. Rawhide fails because there is not a f36 toolbox image yet due to this bug: https://pagure.io/fedora-infrastructure/issue/10145 and f34 needs a backport of containers/toolbox#857 . @edsantiago You should be able to check it in f35 right now. On the other hand, I tested latest podman locally from this build: https://koji.fedoraproject.org/koji/taskinfo?taskID=74625958 and executed the test suite with the latest upstream for toolbox. It passes without major incidents and the usual workflow for toolbox seems ok. |
Submitted PR for f35. Confirmed that it works in 1minutetip. Also confirmed hat you have an f35 bodhi and that tests pass. |
That PR is now merged. Toolbox gating tests are now enabled for podman on f35. But @containers/podman-maintainers I think we have a problem coming up. For any given Fedora NN, toolbox seems to want to pull a I kind of don't think this is acceptable. I would like to ask the toolbox team to make their process more robust, perhaps to pre-tag an |
Toolbox now has support for basic config files that we can use to default to a working image on Rawhide and branched Fedora. We of course would need to bump this config from time to time but this rather easy to do. |
And thanks for the work here! |
Building a new image is a matter of doing a
Umm... honest question: what makes you think that it will take months? I do see activity on the Fedora infrastructure ticket.
I am not sure what the Toolbox developers can do if Fedora's OCI infrastructure is broken, other than talking to the Fedora infra' folks and trying to get it prioritized. Given that this is free software and community and all that, anybody else can help convey that message too. :) |
The issue with the Fedora infrastructure got fixed, and we have new image builds in place now. |
Thanks! I see gating tests failing. Please update here when those are fixed, we will then look into enabling this on podman rawhide. |
@edsantiago The image seems to be in stage yet. As soon as it reaches main regisitry we should be able to check. On the other hand, the infra bug reappeared. I've reopened the issue and asked for fixing it again. |
fedora-toolbox:36 should be in the stable fedora registry now. |
Thanks @juhp !. Yes. It is in stable and after re-running the CI, the gating passes. https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/72523/ |
The infra bug seemed to be just a node that did not updated correctly. Now seems everything works as expected. |
Do you have a link to a working bodhi? All I see is toolbox-0.0.99.2^3.git075b9a8d2779-6.fc36, which fails. I see the jenkins link you posted, but I don't see that actually being reflected in bodhi. |
I've rerun the build because the last time I just run the CI work. Here you have a bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-2021-306c96fe51 |
Thank you! I've submitted a PR for rawhide. |
Thank you @edsantiago ! |
Toolbox gating tests are now running and passing in rawhide and in f35. Thank you everyone for your perseverance and patience. |
Yay! That's wonderful to hear. Thanks, @edsantiago |
/kind feature
Description
containers/toolbox is a popular (in Fedora CoreOS & Fedora Silverblue) downstream user of podman and has been broken several times by podman updates. We have CI checks now in Fedora CoreOS and soon in toolbox itself but it would be even better if you could run the toolbox test suite on podman at least once before releases or directly in CI.
I'll be setting up toolbox CI on Fedora CoreOS soon so feel free to reach out.
Non-exhaustive list of issues:
#10274
#10288
coreos/fedora-coreos-tracker#734
The text was updated successfully, but these errors were encountered: