Add offload plugin

Issue:
When migrating VMs, the VM disks need to be transferred over the network.
This is inefficient, takes additional storage, and is slow. We need
an alternative way to transfer the disks.

Design:
This PR is a PoC of adding an offload plugin for the disk transfer.
The offload plugin is a specific tool that needs to be implemented
by the CSI provider. To use a storage offload plugin, the user
specifies it in the StorageMap destination. This allows users to
migrate a VM even when its disks span multiple datastore types.
For example, some disks could be managed by the offloadPlugin while
others still go over the network if needed.

Example of a storage map with an offload plugin:
```
spec:
  map:
    - destination:
        offloadPlugin:
          image: 'quay.io/mnecas0/offload:latest'
          vars:
            test1: test1
            test2: test2
        storageClass: CSI
      source:
        id: datastore-30
...
```
The offload plugin is started right after the CreateDataVolumes phase;
this way the Kubernetes CSI driver creates empty PVs into which the
disks can be transferred.
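As an illustration, here is a condensed sketch of how such an empty PV can be requested through a blank-source DataVolume (compare the vsphere builder change below; the helper name `blankDataVolume` and the omission of size and access-mode requests are illustrative assumptions, not part of this PR):
```go
package example

import (
	meta "k8s.io/apimachinery/pkg/apis/meta/v1"
	cdi "kubevirt.io/containerized-data-importer-api/pkg/apis/core/v1beta1"
)

// blankDataVolume builds a DataVolume whose source is a blank image, so CDI
// only provisions an empty PV; the offload plugin job fills it later.
func blankDataVolume(name, storageClass string) *cdi.DataVolume {
	return &cdi.DataVolume{
		ObjectMeta: meta.ObjectMeta{Name: name},
		Spec: cdi.DataVolumeSpec{
			Source: &cdi.DataVolumeSource{
				Blank: &cdi.DataVolumeBlankImage{},
			},
			Storage: &cdi.StorageSpec{
				// Size and access-mode requests omitted for brevity.
				StorageClassName: &storageClass,
			},
		},
	}
}
```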

The OffloadPlugin step creates a job on the destination cluster. The job
runs the provided offload plugin image and is started with the
following parameters:
- `HOST` = URL of the ESXi host
- `PLAN_NAME` = plan name
- `NAMESPACE` = namespace where the migration is running

In addition to these parameters, the job also mounts the vCenter secret.
The secret is mounted at `/etc/secret` and contains the files:
- `accessKeyId` with the username
- `secretKey` with the password
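For reference, a minimal sketch of what an offload plugin entrypoint could look like, assuming the parameters above are passed as environment variables; only the parameter names and secret file paths come from this PR, everything else is illustrative:
```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

const secretDir = "/etc/secret"

// readSecret reads one file from the mounted vCenter secret.
func readSecret(name string) (string, error) {
	data, err := os.ReadFile(filepath.Join(secretDir, name))
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	// Parameters injected by the OffloadPlugin job (assumed to be env vars).
	host := os.Getenv("HOST")           // URL of the ESXi host
	planName := os.Getenv("PLAN_NAME")  // migration plan name
	namespace := os.Getenv("NAMESPACE") // namespace where the migration runs

	username, err := readSecret("accessKeyId")
	if err != nil {
		log.Fatalf("reading accessKeyId: %v", err)
	}
	password, err := readSecret("secretKey")
	if err != nil {
		log.Fatalf("reading secretKey: %v", err)
	}

	fmt.Printf("offloading disks for plan %s/%s from host %s as %s\n",
		namespace, planName, host, username)
	// A real plugin would now call the vendor/CSI-specific API to clone the
	// disks directly into the pre-created empty PVs, authenticating with
	// the username/password against the vCenter/ESXi endpoint.
	_ = password
}
```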

Note:
This change additionally requires
kubev2v#1109, because right now the
cold migration transfer is managed by virt-v2v. kubev2v#1109 removes
this dependency and moves the transfer out to the CNV CDI. This allows us
to split the transfer and conversion steps, which were a single step from
the Forklift perspective. Once the disks are transferred, Forklift runs
`virt-v2v-in-place` on the disks and starts the VM.
The offload plugin path works the same way: the disks are transferred by
the offload plugin and then `virt-v2v-in-place` is run on them.

TODO:
[ ] Add a design doc with more details; this is just a PoC
[ ] Add a check that the OffloadPlugin image exists
[ ] Add a check of the offload plugin disk transfer status
[ ] Allow combining storage map entries with and without OffloadPlugin
[ ] Improve the name of the offload plugin job; right now it is the VM ID

Signed-off-by: Martin Necas <[email protected]>
mnecas committed Oct 19, 2024
1 parent 75ac8a7 commit c2b3402
Showing 13 changed files with 313 additions and 13 deletions.
15 changes: 15 additions & 0 deletions operator/config/crd/bases/forklift.konveyor.io_storagemaps.yaml
@@ -60,6 +60,21 @@ spec:
- ReadWriteMany
- ReadOnlyMany
type: string
offloadPlugin:
description: Offload plugin.
properties:
image:
description: Offload plugin.
type: string
vars:
additionalProperties:
type: string
description: Offload plugin variables passed to the
job
type: object
required:
- image
type: object
storageClass:
description: A storage class.
type: string
8 changes: 8 additions & 0 deletions pkg/apis/forklift/v1beta1/mapping.go
@@ -57,13 +57,21 @@ type StoragePair struct {
type DestinationStorage struct {
// A storage class.
StorageClass string `json:"storageClass"`
// Offload plugin.
OffloadPlugin *OffloadPlugin `json:"offloadPlugin,omitempty"`
// Volume mode.
// +kubebuilder:validation:Enum=Filesystem;Block
VolumeMode core.PersistentVolumeMode `json:"volumeMode,omitempty"`
// Access mode.
// +kubebuilder:validation:Enum=ReadWriteOnce;ReadWriteMany;ReadOnlyMany
AccessMode core.PersistentVolumeAccessMode `json:"accessMode,omitempty"`
}
type OffloadPlugin struct {
// Offload plugin.
Image string `json:"image"`
// Offload plugin variables passed to the job
Vars map[string]string `json:"vars,omitempty"`
}

// Network map spec.
type NetworkMapSpec struct {
9 changes: 9 additions & 0 deletions pkg/apis/forklift/v1beta1/plan.go
@@ -118,3 +118,12 @@ func (r *Plan) IsSourceProviderOCP() bool {
func (r *Plan) IsSourceProviderOVA() bool {
return r.Provider.Source.Type() == Ova
}

func (r *Plan) HasOffloadPlugin() bool {
for _, storageMap := range r.Map.Storage.Spec.Map {
if storageMap.Destination.OffloadPlugin != nil {
return true
}
}
return false
}
1 change: 1 addition & 0 deletions pkg/controller/plan/BUILD.bazel
@@ -8,6 +8,7 @@ go_library(
"hook.go",
"kubevirt.go",
"migration.go",
"offloadplugin.go",
"predicate.go",
"util.go",
"validation.go",
2 changes: 2 additions & 0 deletions pkg/controller/plan/adapter/base/doc.go
@@ -80,6 +80,8 @@ type Builder interface {
LunPersistentVolumeClaims(vmRef ref.Ref) (pvcs []core.PersistentVolumeClaim, err error)
// check whether the builder supports Volume Populators
SupportsVolumePopulators() bool
// check whether the builder supports offload plugin
SupportsOffloadPlugin() bool
// Build populator volumes
PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) ([]*core.PersistentVolumeClaim, error)
// Transferred bytes
3 changes: 3 additions & 0 deletions pkg/controller/plan/adapter/ocp/builder.go
@@ -609,6 +609,9 @@ func pvcSourceName(namespace, name string) string {
func (r *Builder) SupportsVolumePopulators() bool {
return false
}
func (r *Builder) SupportsOffloadPlugin() bool {
return false
}

func (r *Builder) PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) (pvcs []*core.PersistentVolumeClaim, err error) {
err = planbase.VolumePopulatorNotSupportedError
3 changes: 3 additions & 0 deletions pkg/controller/plan/adapter/openstack/builder.go
@@ -936,6 +936,9 @@ func (r *Builder) LunPersistentVolumeClaims(vmRef ref.Ref) (pvcs []core.Persiste
func (r *Builder) SupportsVolumePopulators() bool {
return true
}
func (r *Builder) SupportsOffloadPlugin() bool {
return false
}

func (r *Builder) PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) (pvcs []*core.PersistentVolumeClaim, err error) {
workload := &model.Workload{}
3 changes: 3 additions & 0 deletions pkg/controller/plan/adapter/ova/builder.go
@@ -543,6 +543,9 @@ func (r *Builder) LunPersistentVolumeClaims(vmRef ref.Ref) (pvcs []core.Persiste
func (r *Builder) SupportsVolumePopulators() bool {
return false
}
func (r *Builder) SupportsOffloadPlugin() bool {
return false
}

func (r *Builder) PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) (pvcs []*core.PersistentVolumeClaim, err error) {
err = planbase.VolumePopulatorNotSupportedError
3 changes: 3 additions & 0 deletions pkg/controller/plan/adapter/ovirt/builder.go
@@ -715,6 +715,9 @@ func (r *Builder) LunPersistentVolumeClaims(vmRef ref.Ref) (pvcs []core.Persiste
func (r *Builder) SupportsVolumePopulators() bool {
return !r.Context.Plan.Spec.Warm && r.Context.Plan.Provider.Destination.IsHost()
}
func (r *Builder) SupportsOffloadPlugin() bool {
return false
}

func (r *Builder) PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) (pvcs []*core.PersistentVolumeClaim, err error) {
workload := &model.Workload{}
29 changes: 20 additions & 9 deletions pkg/controller/plan/adapter/vsphere/builder.go
@@ -435,15 +435,21 @@ func (r *Builder) DataVolumes(vmRef ref.Ref, secret *core.Secret, _ *core.Config
storageClass := mapped.Destination.StorageClass
var dvSource cdi.DataVolumeSource
// Let CDI do the copying
-dvSource = cdi.DataVolumeSource{
-	VDDK: &cdi.DataVolumeSourceVDDK{
-		BackingFile:  r.baseVolume(disk.File),
-		UUID:         vm.UUID,
-		URL:          url,
-		SecretRef:    secret.Name,
-		Thumbprint:   thumbprint,
-		InitImageURL: r.Source.Provider.Spec.Settings[api.VDDK],
-	},
+if mapped.Destination.OffloadPlugin != nil {
+	dvSource = cdi.DataVolumeSource{
+		Blank: &cdi.DataVolumeBlankImage{},
+	}
+} else {
+	dvSource = cdi.DataVolumeSource{
+		VDDK: &cdi.DataVolumeSourceVDDK{
+			BackingFile:  r.baseVolume(disk.File),
+			UUID:         vm.UUID,
+			URL:          url,
+			SecretRef:    secret.Name,
+			Thumbprint:   thumbprint,
+			InitImageURL: r.Source.Provider.Spec.Settings[api.VDDK],
+		},
+	}
}
dvSpec := cdi.DataVolumeSpec{
Source: &dvSource,
@@ -747,6 +753,7 @@ func (r *Builder) Tasks(vmRef ref.Ref) (list []*plan.Task, err error) {
err = liberr.Wrap(err, "vm", vmRef.String())
return
}
// TODO: Filter offload plugin by the Datastore?
for _, disk := range vm.Disks {
mB := disk.Capacity / 0x100000
list = append(
@@ -954,6 +961,10 @@ func (r *Builder) SupportsVolumePopulators() bool {
return false
}

func (r *Builder) SupportsOffloadPlugin() bool {
return true
}

func (r *Builder) PopulatorVolumes(vmRef ref.Ref, annotations map[string]string, secretName string) (pvcs []*core.PersistentVolumeClaim, err error) {
err = planbase.VolumePopulatorNotSupportedError
return
53 changes: 49 additions & 4 deletions pkg/controller/plan/migration.go
@@ -49,6 +49,7 @@ var (
CDIDiskCopy libitr.Flag = 0x08
OvaImageMigration libitr.Flag = 0x10
OpenstackImageMigration libitr.Flag = 0x20
HasOffloadPlugin libitr.Flag = 0x40
)

// Phases.
@@ -59,6 +60,7 @@ const (
PowerOffSource = "PowerOffSource"
WaitForPowerOff = "WaitForPowerOff"
CreateDataVolumes = "CreateDataVolumes"
OffloadPlugin = "OffloadPlugin"
CreateVM = "CreateVM"
CopyDisks = "CopyDisks"
CopyingPaused = "CopyingPaused"
@@ -105,6 +107,7 @@ var (
{Name: PowerOffSource},
{Name: WaitForPowerOff},
{Name: CreateDataVolumes},
{Name: OffloadPlugin, All: HasOffloadPlugin},
{Name: CopyDisks, All: CDIDiskCopy},
{Name: CreateGuestConversionPod, All: RequiresConversion},
{Name: ConvertGuest, All: RequiresConversion},
@@ -640,7 +643,7 @@ func (r *Migration) step(vm *plan.VMStatus) (step string) {
switch vm.Phase {
case Started, CreateInitialSnapshot, WaitForInitialSnapshot, CreateDataVolumes:
step = Initialize
- case CopyDisks, CopyingPaused, CreateSnapshot, WaitForSnapshot, AddCheckpoint, ConvertOpenstackSnapshot:
+ case CopyDisks, CopyingPaused, CreateSnapshot, WaitForSnapshot, AddCheckpoint, ConvertOpenstackSnapshot, OffloadPlugin:
step = DiskTransfer
case CreateFinalSnapshot, WaitForFinalSnapshot, AddFinalCheckpoint, Finalize:
step = Cutover
@@ -740,6 +743,44 @@ func (r *Migration) execute(vm *plan.VMStatus) (err error) {
} else {
vm.Phase = Completed
}
case OffloadPlugin:
step, found := vm.FindStep(r.step(vm))
if !found {
vm.AddError(fmt.Sprintf("Step '%s' not found", r.step(vm)))
break
}
step.MarkStarted()
step.Phase = Running
runner := OffloadPluginRunner{
Context: r.Context,
kubevirt: r.kubevirt,
}
var job *batchv1.Job
job, err = runner.Run(vm)
if err != nil {
r.Log.Error(err, "Starting the offload plugin", "vm", vm.Name)
step.AddError(err.Error())
err = nil
return
}
conditions := libcnd.Conditions{}
for _, cnd := range job.Status.Conditions {
conditions.SetCondition(libcnd.Condition{
Type: string(cnd.Type),
Status: string(cnd.Status),
Reason: cnd.Reason,
Message: cnd.Message,
})
}
if conditions.HasCondition("Failed") {
step.AddError(conditions.FindCondition("Failed").Message)
step.MarkCompleted()
} else if int(job.Status.Failed) > retry {
step.AddError("Retry limit exceeded.")
step.MarkCompleted()
} else if job.Status.Succeeded > 0 {
vm.Phase = r.next(vm.Phase)
}
case CreateDataVolumes:
step, found := vm.FindStep(r.step(vm))
if !found {
@@ -885,7 +926,9 @@
}
step.MarkStarted()
step.Phase = Running

if r.builder.SupportsOffloadPlugin() {
// TODO: Add progress of the offload plugin transfer
}
if r.builder.SupportsVolumePopulators() {
err = r.updatePopulatorCopyProgress(vm, step)
} else {
@@ -1848,12 +1891,14 @@ func (r *Predicate) Evaluate(flag libitr.Flag) (allowed bool, err error) {
_, allowed = r.vm.FindHook(PostHook)
case RequiresConversion:
allowed = r.context.Source.Provider.RequiresConversion()
- case OvaImageMigration:
- allowed = r.context.Plan.IsSourceProviderOVA()
case CDIDiskCopy:
allowed = !r.context.Plan.IsSourceProviderOVA()
+ case OvaImageMigration:
+ allowed = r.context.Plan.IsSourceProviderOVA()
case OpenstackImageMigration:
allowed = r.context.Plan.IsSourceProviderOpenstack()
+ case HasOffloadPlugin:
+ allowed = r.context.Plan.HasOffloadPlugin()
}

return