Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert the vm.sh script into a go binary #1164

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Kuruyia
Copy link

@Kuruyia Kuruyia commented Mar 31, 2024

What this PR does / why we need it:
This adds a new Go CLI based on Cobra that will replace the vm.sh script used when provisioning a cluster to set up the VMs that will contain the k8s cluster nodes. This is done in order to increase the maintainability of this script in the future.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1160

Special notes for your reviewer:
The CLI flags were kept as-is so this can be used as a drop-in replacement.

I've already started to split the functions into multiple files so it is (hopefully) already a bit cleaner than the current shell script.

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note:

NONE

@kubevirt-bot kubevirt-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Mar 31, 2024
@kubevirt-bot kubevirt-bot requested a review from ormergi March 31, 2024 11:50
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign awels for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot requested a review from vladikr March 31, 2024 11:50
@kubevirt-bot
Copy link
Contributor

Hi @Kuruyia. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

)

// Wrapper around the output of the "qemu-img info" command
type QemuImgInfo struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to export this type since it is used only locally

// Creates a virtual disk image
func CreateDisk(path string, format string, size uint64) error {
cmd := exec.Command("qemu-img", "create", "-f", format, path, strconv.FormatUint(size, 10))
cmd.Stderr = os.Stderr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not also the StdOut?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know if we wanted to show the output of those commands, but in hindsight more verbosity can't hurt.

)

// The name of the QEMU main executable
const qemuExec = "qemu-system-%s"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is correct for all OSes, in Centos Stream it should be qemu-kvm. I'm wondering how this can work with the current script, maybe there is a symlink

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script is actually running in a Fedora 39 container, so this is why it currently works.

Do we want to add a check for the current OS and use qemu-kvm if available instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I think we should get rid of fedora in the future. It is up to you, either you can a put a comment with a TODO and we can address this in a second PR or you can add it here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better to address this in a second PR, so this one doesn't get too big

Copy link
Member

@alicefr alicefr Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to get rid of Fedora in this context?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to. But we have converted everything to centos stream. It seems a bit inconsistent to me that we still keep fedora. Aynway, this can be a topic for another PR.


// Writes the SSH script to connect to the guest
func writeSshFiles(nodeNum int) error {
err := os.WriteFile(sshScriptPath, []byte(fmt.Sprintf(sshScriptContents, nodeNum)), 744)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please inline this function with the error check

return err
}

err = os.WriteFile(sshReadyPath, []byte("done\n"), 0644)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same here

Comment on lines 247 to 288
_, err = os.Stat(provisionedDiskPath)

if err == nil {
os.Remove(firstDiskPath)
os.Symlink(provisionedDiskPath, firstDiskPath)
}

diskFile, diskBackingFile, err := utils.CalcNextDisk(".", nextDiskFlag)
if err != nil {
return fmt.Errorf("Failed to calculate the next disk image path: %v", err)
}

diskSize, err := qemu.GetDiskSize(diskBackingFile)
if err != nil {
return fmt.Errorf("Failed to get the disk image size of \"%s\": %v", diskBackingFile, err)
}

if diskSize < defaultDiskSize {
diskSize = defaultDiskSize
}

fmt.Printf("Creating disk \"%s backed by %s with size %d\"\n", diskFile, diskBackingFile, diskSize)
err = qemu.CreateDiskWithBackingFile(diskFile, "qcow2", diskSize, diskBackingFile, "qcow2")
if err != nil {
return err
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be encapsulate into a separate function

@alicefr
Copy link
Member

alicefr commented Apr 5, 2024

@Kuruyia thanks for the work! Can you implement a couple of unit tests for the functions?

@alicefr
Copy link
Member

alicefr commented Apr 5, 2024

@dhiller @brianmcarey do you have any idea how we could test this?

if rootless {
for _, port := range ports {
// Add DNAT rule for rootless podman (traffic originating from loopback adapter)
toDestinationFlag := fmt.Sprintf("192.168.66.101:%d", port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you used the same constant value in mulitple places, I would define a const for it

@Kuruyia
Copy link
Author

Kuruyia commented Apr 7, 2024

@Kuruyia thanks for the work! Can you implement a couple of unit tests for the functions?

Hey, thanks for your review!

I'll address your change requests and add the unit tests this week. I guess we're only testing the functions that don't have any side effects for now?

It'll be more involved to test functions that execute qemu-img, try to create a /dev/kvm device, ...

@alicefr
Copy link
Member

alicefr commented Apr 8, 2024

@Kuruyia thanks for the work! Can you implement a couple of unit tests for the functions?

Hey, thanks for your review!

I'll address your change requests and add the unit tests this week. I guess we're only testing the functions that don't have any side effects for now?

It'll be more involved to test functions that execute qemu-img, try to create a /dev/kvm device, ...

Since we didn't have any tests for the bash script, I would just add some simple ones.
For example, some that check the qemu command line. The rest is a bit difficult because this run multiple binaries, for example, the part that creates the the iptables, I would skip it for now.

Another place where you can add unit tests is CalcNextDisk. In this case, you could for example declare type and a function type and overwrite it in the unit tests for files, err := os.ReadDir(searchDir). Something like

type readDirFunc func(n int) ([]os.DirEntry, error)
type DiskUtil struct {
  readDirFunc readDir
}

func NewDiskUtil() *DiskUtil {
  return &DiskUtil{
      readDir = os.ReadDir
  }
}

and in the tests something like this:

type mockDirEntry struct {
   name string
}

func newmockDirEntry(name string) os.DirEntry {
   return &mockDirEntry{name:name}
}

func (m *mockDirEntry) Name() string {
  return m.name
}
[...] // implement the rest of the function for DirEntry

func mockReadDir(n int) ([]os.DirEntry, error) {
    return os.DirEntry{
       newmockDirEntry("disk0.qcow2"),
       newmockDirEntry("disk1.qcow2"),
    }, nil
}

diskUtil := &DiskUtil{
      readDir = mockReadDir
  }
// test
diskUtil.CalcNextDisk(...)

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Apr 14, 2024
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Apr 14, 2024
@dhiller
Copy link
Contributor

dhiller commented Apr 16, 2024

@dhiller @brianmcarey do you have any idea how we could test this?

@Kuruyia thank you for your work here!

@alicefr I agree with what you suggested. Since we just didn't (or rather couldn't) unit-test anything here, let's try to increase coverage of testing where we can - anything is better than nothing, as it is currently.

@lyarwood
Copy link
Member

/cc

FWIW #1174 is about to merge and will need to be covered here. I wasn't aware of this PR prior to starting this work sorry otherwise I would've extended it myself.

@kubevirt-bot kubevirt-bot requested a review from lyarwood April 19, 2024 14:23
@Kuruyia
Copy link
Author

Kuruyia commented Apr 19, 2024

/cc

FWIW #1174 is about to merge and will need to be covered here. I wasn't aware of this PR prior to starting this work sorry otherwise I would've extended it myself.

No problem, I was working on the unit tests today so I'll work on adding your modifications as well afterwards, thanks for letting me know.

@kubevirt-bot kubevirt-bot added dco-signoff: no Indicates the PR's author has not DCO signed all their commits. size/XXL and removed dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/XL labels Apr 20, 2024
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. and removed dco-signoff: no Indicates the PR's author has not DCO signed all their commits. labels Apr 20, 2024
@alicefr
Copy link
Member

alicefr commented Jun 26, 2024

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2024
@brianmcarey
Copy link
Member

/test ?

@kubevirt-bot
Copy link
Contributor

@brianmcarey: The following commands are available to trigger required jobs:

  • /test check-provision-alpine-with-test-tooling
  • /test check-provision-centos-base
  • /test check-provision-k8s-1.28
  • /test check-provision-k8s-1.29
  • /test check-provision-k8s-1.30
  • /test check-provision-manager
  • /test check-up-kind-ovn
  • /test check-up-kind-sriov

The following commands are available to trigger optional jobs:

  • /test check-gocli
  • /test check-up-kind-1.27-vgpu

Use /test all to run the following jobs that were automatically triggered:

  • check-provision-alpine-with-test-tooling
  • check-provision-k8s-1.28
  • check-provision-k8s-1.29
  • check-provision-k8s-1.30
  • check-provision-manager
  • check-up-kind-1.27-vgpu
  • check-up-kind-sriov

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@brianmcarey
Copy link
Member

/test check-gocli
/test check-provision-centos-base

@kubevirt-bot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 28, 2024
@Kuruyia
Copy link
Author

Kuruyia commented Jun 28, 2024

/retest

@kubevirt-bot
Copy link
Contributor

@Kuruyia: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@brianmcarey
Copy link
Member

/test check-gocli
/test check-provision-centos-base

@brianmcarey
Copy link
Member

/ok-to-test

@Kuruyia
Copy link
Author

Kuruyia commented Jul 1, 2024

/retest

@kubevirt-bot
Copy link
Contributor

@Kuruyia: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@brianmcarey
Copy link
Member

/test check-gocli
/test check-provision-centos-base

@brianmcarey
Copy link
Member

/ok-to-test

@Kuruyia
Copy link
Author

Kuruyia commented Aug 15, 2024

I fixed the name of the base golang image, and tested cluster provisioning under Fedora 40 with Podman. It built successfully, so hopefully this time's the charm for the CI as well :)

/retest

@kubevirt-bot
Copy link
Contributor

@Kuruyia: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

I fixed the name of the base golang image, and tested cluster provisioning under Fedora 40 with Podman. It built successfully, so hopefully this time's the charm for the CI as well :)

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@alicefr
Copy link
Member

alicefr commented Aug 16, 2024

/retest

@alicefr
Copy link
Member

alicefr commented Aug 16, 2024

hi @Kuruyia thanks for the persistence. I'm in favor of this change.
Kind reminder to @xpivarc @brianmcarey

@kubevirt-bot
Copy link
Contributor

kubevirt-bot commented Aug 16, 2024

@Kuruyia: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
check-provision-centos-base 55c5735 link true /test check-provision-centos-base
check-up-kind-1.30-vgpu 0d8d7fa link false /test check-up-kind-1.30-vgpu
check-up-kind-1.27-vgpu 0d8d7fa link false /test check-up-kind-1.27-vgpu

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Kuruyia
Copy link
Author

Kuruyia commented Aug 16, 2024

hi @Kuruyia thanks for the persistence. I'm in favor of this change. Kind reminder to @xpivarc @brianmcarey

No problem.

It looks like the check-up-kind-1.*-vgpu tests failed, but looking at those test histories, they just tend to do that?
I see that the required check-provision-* tests also didn't run on the last commit (0d8d7fa). Could you just rerun those?

@oshoval
Copy link
Contributor

oshoval commented Aug 16, 2024

Thanks for your contribution.

It looks like the check-up-kind-1.*-vgpu tests failed, but looking at those test histories, they just tend to do that? I see that the required check-provision-* tests also didn't run on the last commit (0d8d7fa). Could you just rerun those?

Centos9 folder which is the only changed folder doesnt affect kind-X providers.
Jobs are running, here for example https://prow.ci.kubevirt.io/view/gs/kubevirt-prow/pr-logs/pull/kubevirt_kubevirtci/1164/check-provision-k8s-1.28/1824330643983568896
Prow / Tide takes care about it.

Please consider that check-provision uses podman, but the post submit itself uses docker, worth to check it as well,
maybe for example by creating a draft PR that use docker to run presubmits and run rehearse there.

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 8, 2024
@kubevirt-bot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert the vm.sh script into a go binary
7 participants