Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KV provider #1230

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

KV provider #1230

wants to merge 8 commits into from

Conversation

aerosouund
Copy link
Member

What this PR does / why we need it:

Create the KubevirtProvider type and make it the main entity for cluster actions

Part of: #1110

Special Notes
This branch is based off opts. PR: #1217
it introduces the same commits but after opts gets merged it will only introduce commits of the kubevirt provider

Checklist

Release note:

Create kubevirt provider, move run and provision logic to it

@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Jul 23, 2024
@kubevirt-bot
Copy link
Contributor

Hi @aerosouund. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign awels for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aerosouund
Copy link
Member Author

@aerosouund aerosouund changed the title Kv provider KV provider Jul 28, 2024
@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 2, 2024
@aerosouund aerosouund mentioned this pull request Aug 14, 2024
8 tasks
@kubevirt-bot kubevirt-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 18, 2024
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2024
@dhiller
Copy link
Contributor

dhiller commented Aug 26, 2024

/test all

@aerosouund
Copy link
Member Author

aerosouund commented Aug 26, 2024

I think a reason for this flake, the one where the CI job keeps waiting on the node to stop is that the node shuts down before reporting the status back to the ssh agent which makes it not know whether this happened or no

this code previously looked like this

_cmd(cli, nodeContainer(prefix, nodeName), "ssh.sh sudo shutdown now -h", "shutting down the node")

and now it looks like this (this PR introduces this change)

    if err = sshClient.Command("sudo shutdown now -h"); err != nil {
        return err
    }

which results in a golang error, since previously the error wasn't being handled

failed to execute command: sudo shutdown now -h

and so what i suggest we do is that we don't handle the error, i don't know if we also need to force the code to continue executing after a certain duration but this may not be necessary. since previously when ssh.sh would fail the gocli wouldn't get this failure and the CI run hangs. but since we now ssh directly its guaranteed that if there's an error you will get it

I will push a commit with this suggestion

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2024
@xpivarc
Copy link
Member

xpivarc commented Sep 11, 2024

I think a reason for this flake, the one where the CI job keeps waiting on the node to stop is that the node shuts down before reporting the status back to the ssh agent which makes it not know whether this happened or no

this code previously looked like this

_cmd(cli, nodeContainer(prefix, nodeName), "ssh.sh sudo shutdown now -h", "shutting down the node")

and now it looks like this (this PR introduces this change)

    if err = sshClient.Command("sudo shutdown now -h"); err != nil {
        return err
    }

which results in a golang error, since previously the error wasn't being handled

failed to execute command: sudo shutdown now -h

Is this from the _cmd?

and so what i suggest we do is that we don't handle the error, i don't know if we also need to force the code to continue executing after a certain duration but this may not be necessary. since previously when ssh.sh would fail the gocli wouldn't get this failure and the CI run hangs. but since we now ssh directly its guaranteed that if there's an error you will get it

I will push a commit with this suggestion

@aerosouund
Copy link
Member Author

aerosouund commented Sep 11, 2024

@xpivarc

Is this from the _cmd?

yes, i believe so. since when it was executed through ssh.sh it would timeout
and if executed and expected to handle the error it crashes

@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2024
@aerosouund
Copy link
Member Author

/hold

@kubevirt-bot kubevirt-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 26, 2024
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 5, 2024
@aerosouund
Copy link
Member Author

/test check-provision-k8s-1.29

@aerosouund
Copy link
Member Author

/test check-provision-k8s-1.31

@aerosouund
Copy link
Member Author

/test check-provision-k8s-1.29

@kubevirt-bot kubevirt-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 18, 2024
@kubevirt-bot kubevirt-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 25, 2024
@aerosouund aerosouund force-pushed the kv-provider branch 2 times, most recently from 361171e to 22465ce Compare October 30, 2024 13:58
Two new opts that represent the two scripts used in the provision phase (provision linux and provision k8s).
Using go embed to include any necessary config files then run the commands on a node using libssh

Signed-off-by: aerosouund <[email protected]>
The KubevirtProvider is a struct representing an arbitrary Kubevirtci running cluster.
It holds all config flags and options that are in the run and provision commands.
A Kubevirt provider can be created in two ways, by creating a cluster using the Start method, or from
an already running cluster.
For this to be possible then json representation of the struct is persisted on the dnsmasq container and later read to parse
the deployed settings
Or through the normal constructor which uses the option pattern to avoid a bloated function signature

The logic that was previously in run.go has been split to several methods to facilitate readability and testing (runNFSGanesha, runRegistry, prepareQemuCmd, prepareDeviceMappings)
and dnsmasq creation logic got moved to its own method instead of existing in its own package

Floating methods such as waitForVMToBeUp, nodeNameFromIndex, nodeContainer.. etc were grouped to be methods of the struct

Signed-off-by: aerosouund <[email protected]>
To avoid having to read each flag and return an error if its unset leverage the FlagMap, a map of flag name to FlagConfig.
a FlagConfig is the type of this flag (string, int, uint16, bool or array of string) and the option function that sets the value
of this flag on the KubevirtProvider struct.
During parsing of flags this map is being iterated on and each option gets appended to an array to later be used in the KubevirtProvider constructor.

The run method's role is now to parse the flags and pass them to the provider and just call Start.
All the floating methods in run.go are removed after being moved to the provider.

Signed-off-by: aerosouund <[email protected]>
This functionality now exists in the KubevirtProvider type and doesn't need a package of its own

Signed-off-by: aerosouund <[email protected]>
The KubevirtProvider type is what provides the methods that run a node or run the k8s options.
Testing logic has been moved to a Base Provider Suite

Signed-off-by: aerosouund <[email protected]>
…uint16

All references to ports in the codebase use uint not uint16.
There is no reason to keep the ports as they are

Signed-off-by: aerosouund <[email protected]>
@kubevirt-bot
Copy link
Contributor

kubevirt-bot commented Oct 31, 2024

@aerosouund: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
check-up-kind-1.27-vgpu 28752d9 link false /test check-up-kind-1.27-vgpu
check-provision-k8s-1.28 28752d9 link true /test check-provision-k8s-1.28

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@aerosouund
Copy link
Member Author

@brianmcarey @dhiller @xpivarc
The PR is finally ready, highly appreciate a review when you have time

@aerosouund aerosouund mentioned this pull request Nov 6, 2024
8 tasks
@kubevirt-bot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 8, 2024
@brianmcarey
Copy link
Member

/test check-provision-k8s-1.30

Copy link
Member

@brianmcarey brianmcarey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @aerosouund - just a couple of comments inline

/cc @xpivarc


operation := func() error {
client, err = ssh.Dial("tcp", net.JoinHostPort("127.0.0.1", fmt.Sprint(s.sshPort)), s.config)
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove the more verbose error message here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, I can reintroduce it. I probably just removed it unintentionally

@@ -35,6 +37,18 @@ var _ = Describe("Node Provisioning", func() {
}

k8sClient = k8s.NewTestClient(reactors...)
kp = NewKubevirtProvider("k8s-1.30", "", &client.Client{}, []KubevirtProviderOption{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get k8s-1.30 dynamically from what folders are existing? This will fail when removing the k8s-1.30 provider?

Copy link
Member Author

@aerosouund aerosouund Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No its unrelated to the folder and off the top of my mind it won't fail if we drop 1.30 altogether
the version validation logic has been moved to the run command, and since this is being defined in line then it wont be obstructed by this logic

cluster-provision/gocli/cmd/run.go:143

	allowedK8sVersions := []string{"k8s-1.28", "k8s-1.29", "k8s-1.30", "k8s-1.31"}
	var validVersion bool
	for _, v := range allowedK8sVersions {
		if k8sVersion == v {
			validVersion = true
		}
	}
	if !validVersion {
		return fmt.Errorf("Invalid k8s version passed, please use one of k8s-1.28, k8s-1.29, k8s-1.30 or k8s-1.31")
	}

I don't think anywhere else in the code does this validation but I will double check

_, err = kp.Docker.ContainerCommit(ctx, node.ID, container.CommitOptions{
Reference: target,
Comment: "PROVISION SUCCEEDED",
Author: "gocli",
Copy link
Member

@brianmcarey brianmcarey Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the author be "The KubeVirt Authors"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can change it

@kubevirt-bot kubevirt-bot requested a review from xpivarc December 3, 2024 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants