Add kernel-64k extension
Since RHEL 9 reverted to 4K memory pages for aarch64, add a way to
switch to a hugepage kernel.
The MachineConfig should contain the following to trigger the kernel
switch:
spec:
  kernelType: 64k-pages

This is exclusive with the `realtime` kernel option.

xref https://issues.redhat.com/browse/COS-2402
This requires openshift/os#1351

Signed-off-by: jbtrystram <[email protected]>
jbtrystram committed Oct 20, 2023
1 parent 3ea0806 commit ca0f407
Showing 4 changed files with 46 additions and 24 deletions.
10 changes: 8 additions & 2 deletions docs/MachineConfiguration.md
@@ -180,7 +180,7 @@ When a machine boots with `nosmt` Kernel Argument, it disables multi-threading o
### KernelType

This feature is available with OCP 4.4 and onward releases as both `day 1` and `day 2` operation. It allows to choose between traditional and Real Time (RT) kernel on an RHCOS node. Supported values are
`""` or `default` for traditional kernel and `realtime` for RT kernel.
`""` or `default` for traditional kernel, `realtime` for RT kernel and `64k-pages` for 64k memory pages on aarch64.

To set kernelType field during cluster install, see the [installer guide](https://github.com/openshift/installer/blob/master/docs/user/customization.md#Switching-RHCOS-host-kernel-using-KernelType).

@@ -201,7 +201,13 @@ spec:
**Note:** The RT kernel lowers throughput (performance) in return for improved worst-case latency bounds. This feature is intended only for use cases that require consistent low latency. For more information, see the [Linux Foundation wiki](https://wiki.linuxfoundation.org/realtime/start) and the [RHEL RT portal](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/8/).

### RHCOS Extensions
RHCOS is a minimal OCP focused OS which provides capabilities common across all the platforms. With extensions support, OCP 4.6 and onward users can enable a limited set of additional functionality on the RHCOS nodes. In OCP 4.6 the supported extensions is `usbguard`. In OCP 4.8 the supported extensions are `usbguard` and `sandboxed-containers`. In OCP 4.11 the supported extensions are `usbguard`, `sandboxed-containers`, and `kerberos`. In OCP 4.14 the supported extensions are `usbguard`, `sandboxed-containers`, `kerberos`, `ipsec` and `wasm`.
RHCOS is a minimal OCP focused OS which provides capabilities common across all the platforms. With extensions support, OCP 4.6 and onward users can enable a limited set of additional functionality on the RHCOS nodes.
| OCP version | Supported extensions |
| ------------- | ---------------------------- |
| 4.6 | `usbguard` |
| 4.8 | `usbguard`, `sandboxed-containers` |
| 4.11 | `usbguard`, `sandboxed-containers`, `kerberos` |
| 4.14 | `usbguard`, `sandboxed-containers`, `kerberos`, `ipsec`, `wasm` |

Extensions can be installed by creating a MachineConfig object. Extensions can be enabled as both day1 and day2. Check [installer guide](https://github.com/openshift/installer/blob/master/docs/user/customization.md#Enabling-RHCOS-Extensions) to enable extensions during cluster install.

3 changes: 3 additions & 0 deletions pkg/controller/common/constants.go
@@ -22,6 +22,9 @@ const (
// KernelTypeRealtime denominates the realtime kernel type
KernelTypeRealtime = "realtime"

// KernelType64kPages denominates the 64k pages kernel
KernelType64kPages = "64k-pages"

// MasterLabel defines the label associated with master node. The master taint uses the same label as taint's key
MasterLabel = "node-role.kubernetes.io/master"

8 changes: 4 additions & 4 deletions pkg/controller/common/helpers.go
@@ -120,17 +120,17 @@ func MergeMachineConfigs(configs []*mcfgv1.MachineConfig, cconfig *mcfgv1.Contro
return nil, err
}

// Setting FIPS to true or kerneType to realtime in any MachineConfig takes priority in setting that field
// Setting FIPS to true or kernelType to a non-default value in any MachineConfig takes priority in setting that field
for _, cfg := range configs {
if cfg.Spec.FIPS {
fips = true
}
if cfg.Spec.KernelType == KernelTypeRealtime {
if cfg.Spec.KernelType == KernelTypeRealtime || cfg.Spec.KernelType == KernelType64kPages {
kernelType = cfg.Spec.KernelType
}
}

// If no MC sets kerneType, then set it to 'default' since that's what it is using
// If no MC sets kernelType, then set it to 'default' since that's what it is using
if kernelType == "" {
kernelType = KernelTypeDefault
}
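The precedence rule in this hunk can be sketched in isolation. The following is a minimal, self-contained illustration, not the operator's code: `mergeKernelType` and the lowercase constants are hypothetical names that only mirror the loop shown above. It makes one consequence of the loop visible: when several MachineConfigs set different non-default values, the last one in the slice wins.

```go
package main

import "fmt"

// Hypothetical local copies of the constants from constants.go.
const (
	kernelTypeDefault  = "default"
	kernelTypeRealtime = "realtime"
	kernelType64kPages = "64k-pages"
)

// mergeKernelType mirrors the loop in the hunk above: any non-default
// kernel type overrides the default, and later entries overwrite
// earlier ones.
func mergeKernelType(kernelTypes []string) string {
	merged := ""
	for _, kt := range kernelTypes {
		if kt == kernelTypeRealtime || kt == kernelType64kPages {
			merged = kt
		}
	}
	// No config asked for a special kernel, so report the default.
	if merged == "" {
		merged = kernelTypeDefault
	}
	return merged
}

func main() {
	fmt.Println(mergeKernelType([]string{"", "64k-pages", "default"}))
	fmt.Println(mergeKernelType([]string{"realtime", "64k-pages"}))
	fmt.Println(mergeKernelType(nil))
}
```

Because the commit message describes `64k-pages` as exclusive with `realtime`, a pool that ends up with both values across its MachineConfigs is presumably a misconfiguration; the sketch only shows what the merge loop itself would select.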
@@ -569,7 +569,7 @@ func InSlice(elem string, slice []string) bool {

// ValidateMachineConfig validates that given MachineConfig Spec is valid.
func ValidateMachineConfig(cfg mcfgv1.MachineConfigSpec) error {
if !(cfg.KernelType == "" || cfg.KernelType == KernelTypeDefault || cfg.KernelType == KernelTypeRealtime) {
if !(cfg.KernelType == "" || cfg.KernelType == KernelTypeDefault || cfg.KernelType == KernelTypeRealtime || cfg.KernelType == KernelType64kPages) {
return fmt.Errorf("kernelType=%s is invalid", cfg.KernelType)
}
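As the list of accepted values grows, the chained comparison above could also be expressed as a set lookup. This is a sketch of one possible alternative, not what the commit does; `validKernelTypes` and `validateKernelType` are hypothetical names, and the error string is copied from the hunk.

```go
package main

import "fmt"

// validKernelTypes lists the values accepted by the check in the hunk
// above. The map itself is an assumption made for illustration.
var validKernelTypes = map[string]bool{
	"":          true,
	"default":   true,
	"realtime":  true,
	"64k-pages": true,
}

// validateKernelType returns an error for any value outside the set.
func validateKernelType(kernelType string) error {
	if !validKernelTypes[kernelType] {
		return fmt.Errorf("kernelType=%s is invalid", kernelType)
	}
	return nil
}

func main() {
	for _, kt := range []string{"", "default", "realtime", "64k-pages", "hugepages"} {
		fmt.Printf("%q -> %v\n", kt, validateKernelType(kt))
	}
}
```

Note that neither the original check nor this sketch looks at the node architecture, even though the documentation change describes `64k-pages` as an aarch64 option.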

49 changes: 31 additions & 18 deletions pkg/daemon/update.go
@@ -273,7 +273,7 @@ func (dn *CoreOSDaemon) applyOSChanges(mcDiff machineConfigDiff, oldConfig, newC
}
}

// Only check the image type and excute OS changes if:
// Only check the image type and execute OS changes if:
// - machineconfig changed
// - we're staying on a realtime kernel ( need to run rpm-ostree update )
// - we have extensions ( need to run rpm-ostree update )
@@ -282,7 +282,9 @@ func (dn *CoreOSDaemon) applyOSChanges(mcDiff machineConfigDiff, oldConfig, newC
// if they were in use, so we also need to preserve that behavior.
// https://issues.redhat.com/browse/OCPBUGS-4049
if mcDiff.osUpdate || mcDiff.extensions || mcDiff.kernelType || mcDiff.kargs ||
canonicalizeKernelType(newConfig.Spec.KernelType) == ctrlcommon.KernelTypeRealtime || len(newConfig.Spec.Extensions) > 0 {
canonicalizeKernelType(newConfig.Spec.KernelType) == ctrlcommon.KernelTypeRealtime ||
canonicalizeKernelType(newConfig.Spec.KernelType) == ctrlcommon.KernelType64kPages ||
len(newConfig.Spec.Extensions) > 0 {

// Throw started/staged events only if there is any update required for the OS
if dn.nodeWriter != nil {
@@ -722,6 +724,8 @@ func (mcDiff *machineConfigDiff) osChangesString() string {
func canonicalizeKernelType(kernelType string) string {
if kernelType == ctrlcommon.KernelTypeRealtime {
return ctrlcommon.KernelTypeRealtime
} else if kernelType == ctrlcommon.KernelType64kPages {
return ctrlcommon.KernelType64kPages
}
return ctrlcommon.KernelTypeDefault
}
@@ -1127,7 +1131,7 @@ func (dn *CoreOSDaemon) applyExtensions(oldConfig, newConfig *mcfgv1.MachineConf
}

// switchKernel updates kernel on host with the kernelType specified in MachineConfig.
// Right now it supports default (traditional) and realtime kernel
// Right now it supports default (traditional), realtime kernel and 64k pages kernel
func (dn *CoreOSDaemon) switchKernel(oldConfig, newConfig *mcfgv1.MachineConfig) error {
// We support Kernel update only on RHCOS and SCOS nodes
if !dn.os.IsEL() {
@@ -1148,8 +1152,8 @@ func (dn *CoreOSDaemon) switchKernel(oldConfig, newConfig *mcfgv1.MachineConfig)
defaultKernel := []string{"kernel", "kernel-core", "kernel-modules", "kernel-modules-core", "kernel-modules-extra"}
// Note this list explicitly does *not* include kernel-rt as that is a meta-package that tries to pull in a lot
// of other dependencies we don't want for historical reasons.
// kernel-rt also has a split off kernel-rt-kvm subpackage because it's in a separate subscription in RHEL.
realtimeKernel := []string{"kernel-rt-core", "kernel-rt-modules", "kernel-rt-modules-extra", "kernel-rt-kvm"}
hugePagesKernel := []string{"kernel-64k-core", "kernel-64k-modules", "kernel-64k-modules-core", "kernel-64k-modules-extra"}

if oldKtype != newKtype {
logSystem("Initiating switch to kernel %s", newKtype)
@@ -1165,6 +1169,15 @@ func (dn *CoreOSDaemon) switchKernel(oldConfig, newConfig *mcfgv1.MachineConfig)
args = append(args, "--install", pkg)
}

return runRpmOstree(args...)
} else if newKtype == ctrlcommon.KernelType64kPages {
// Switch to 64k pages kernel
args := []string{"override", "remove"}
args = append(args, defaultKernel...)
for _, pkg := range hugePagesKernel {
args = append(args, "--install", pkg)
}

return runRpmOstree(args...)
}
return fmt.Errorf("unhandled kernel type %s", newKtype)
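To see what the new branch hands to `runRpmOstree`, the argument construction can be reproduced on its own. The sketch below only assembles and prints the argument list; it does not invoke rpm-ostree. `buildKernelSwitchArgs` is a hypothetical helper, while the package lists are copied from the hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// buildKernelSwitchArgs assembles the rpm-ostree arguments the same way
// the hunk does: remove the base kernel packages, then request each
// replacement package with its own --install flag.
func buildKernelSwitchArgs(remove, install []string) []string {
	args := []string{"override", "remove"}
	args = append(args, remove...)
	for _, pkg := range install {
		args = append(args, "--install", pkg)
	}
	return args
}

func main() {
	defaultKernel := []string{"kernel", "kernel-core", "kernel-modules", "kernel-modules-core", "kernel-modules-extra"}
	hugePagesKernel := []string{"kernel-64k-core", "kernel-64k-modules", "kernel-64k-modules-core", "kernel-64k-modules-extra"}

	args := buildKernelSwitchArgs(defaultKernel, hugePagesKernel)
	// Print the command line for inspection only.
	fmt.Println("rpm-ostree " + strings.Join(args, " "))
}
```

The realtime and 64k branches differ only in the list of packages to install, so a shared helper of this shape would cover both.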
@@ -1878,49 +1891,49 @@ func (dn *Daemon) InplaceUpdateViaNewContainer(target string) error {
return nil
}

// queueRevertRTKernel undoes the layering of the RT kernel
func (dn *Daemon) queueRevertRTKernel() error {
// queueRevertKernelSwap undoes the layering of the RT kernel or kernel-64k hugepages
func (dn *Daemon) queueRevertKernelSwap() error {
booted, _, err := dn.NodeUpdaterClient.GetBootedAndStagedDeployment()
if err != nil {
return err
}

// Before we attempt to do an OS update, we must remove the kernel-rt switch
// Before we attempt to do an OS update, we must remove the kernel-rt or kernel-64k switch
// because in the case of updating from RHEL8 to RHEL9, the kernel packages are
// OS version dependent. See also https://github.com/coreos/rpm-ostree/issues/2542
// (Now really what we want to do here is something more like rpm-ostree override reset --kernel
// i.e. the inverse of https://github.com/coreos/rpm-ostree/pull/4322 so that
// we're again not hardcoding even the prefix of kernel packages)
kernelOverrides := []string{}
kernelRtLayers := []string{}
kernelExtLayers := []string{}
for _, removal := range booted.RequestedBaseRemovals {
if removal == "kernel" || strings.HasPrefix(removal, "kernel-") {
kernelOverrides = append(kernelOverrides, removal)
}
}
for _, pkg := range booted.RequestedPackages {
if strings.HasPrefix(pkg, "kernel-rt-") {
kernelRtLayers = append(kernelRtLayers, pkg)
if strings.HasPrefix(pkg, "kernel-rt-") || strings.HasPrefix(pkg, "kernel-64k-") {
kernelExtLayers = append(kernelExtLayers, pkg)
}
}
// We *only* do this switch if the node has done a switch from kernel -> kernel-rt.
// We *only* do this switch if the node has done a switch from kernel -> kernel-rt or kernel-64k.
// We don't want to override any machine-local hotfixes for the kernel package.
// Implicitly in this we don't really support machine-local hotfixes for kernel-rt.
// Implicitly in this we don't really support machine-local hotfixes for kernel-rt or kernel-64k.
// The only sane way to handle that is declarative drop-ins, but really we want to
// just go to deploying pre-built images and not doing per-node mutation with rpm-ostree
// at all.
if len(kernelOverrides) > 0 && len(kernelRtLayers) > 0 {
if len(kernelOverrides) > 0 && len(kernelExtLayers) > 0 {
args := []string{"override", "reset"}
args = append(args, kernelOverrides...)
for _, pkg := range kernelRtLayers {
for _, pkg := range kernelExtLayers {
args = append(args, "--uninstall", pkg)
}
err := runRpmOstree(args...)
if err != nil {
return err
}
} else if len(kernelOverrides) > 0 || len(kernelRtLayers) > 0 {
klog.Infof("notice: detected %d overrides and %d kernel-rt layers", len(kernelOverrides), len(kernelRtLayers))
} else if len(kernelOverrides) > 0 || len(kernelExtLayers) > 0 {
klog.Infof("notice: detected %d kernel overrides and %d kernel-rt or kernel-64k layers", len(kernelOverrides), len(kernelExtLayers))
} else {
klog.Infof("No kernel overrides or replacement detected")
}
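The selection logic of the renamed revert function can be exercised without a live deployment. The sketch below assumes the two string slices stand in for `booted.RequestedBaseRemovals` and `booted.RequestedPackages`; `buildKernelRevertArgs` is a hypothetical helper that returns the rpm-ostree arguments, or nil when the node does not look like it went through a kernel swap.

```go
package main

import (
	"fmt"
	"strings"
)

// buildKernelRevertArgs mirrors the filtering in the hunk above. It only
// builds a reset command when both kernel overrides and kernel-rt or
// kernel-64k layers are present, so machine-local kernel hotfixes are
// left alone.
func buildKernelRevertArgs(baseRemovals, requestedPackages []string) []string {
	var overrides, layers []string
	for _, removal := range baseRemovals {
		if removal == "kernel" || strings.HasPrefix(removal, "kernel-") {
			overrides = append(overrides, removal)
		}
	}
	for _, pkg := range requestedPackages {
		if strings.HasPrefix(pkg, "kernel-rt-") || strings.HasPrefix(pkg, "kernel-64k-") {
			layers = append(layers, pkg)
		}
	}
	if len(overrides) == 0 || len(layers) == 0 {
		return nil
	}
	args := []string{"override", "reset"}
	args = append(args, overrides...)
	for _, pkg := range layers {
		args = append(args, "--uninstall", pkg)
	}
	return args
}

func main() {
	args := buildKernelRevertArgs(
		[]string{"kernel", "kernel-core"},
		[]string{"kernel-64k-core", "usbguard"},
	)
	fmt.Println(strings.Join(args, " "))
}
```

Unrelated layered packages such as `usbguard` are skipped by the prefix checks, which is why the extension list is not affected by the revert.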
@@ -2094,10 +2107,10 @@ func (dn *CoreOSDaemon) applyLayeredOSChanges(mcDiff machineConfigDiff, oldConfi
}
}()

// If we have an OS update *or* a kernel type change, then we must undo the RT kernel
// If we have an OS update *or* a kernel type change, then we must undo the kernel swap
// enablement.
if mcDiff.osUpdate || mcDiff.kernelType {
if err := dn.queueRevertRTKernel(); err != nil {
if err := dn.queueRevertKernelSwap(); err != nil {
mcdPivotErr.Inc()
return err
}