Nova services adoption (no extra cell)
Note about remapping cell DB names from OSP cells naming scheme
to the NG scheme with the superconductor layout.

Add a step to rename default cell as cell1, and to delete stale
Nova services records from cell1 DB during initial databases import,
to properly transition it into a superconductor layout later on.

Adjust minor gaps in the dependencies adoption docs (Placement,
Nova cells DB, OVN etc.)

Address the switch for service overrides spec instead of
externalEndpoints, where it is missing on the path to Nova adoption.

Remove Nova Metadata secret creation workarounds from the EDPM
adoption docs and test suites.

Provide workaround for renaming 'default' cell's DB during adoption.

Add test suites for Nova CP services adoption.

Update EDPM adoption docs and tests to execute Nova compute post-FFU.

Add missing nova and libvirt services for the edpm adoption tests.

Verify no dataplane disruptions during the adoption and upgrade
process.

Verify Nova services still control pre-created VM workload after
FFU/adoption is done.

Update and fix the composition of the services pre-check list to
execute it before stopping services.

Update and fix the composition of the list of the services to be
stopped (cannot pull data from stopped services).

Stop Nova services in stop_openstack_services instead of edpm_adoption
(that was too late to do that).

Get services topology specific configuration in
pull_openstack_configuration. Add missing role for that as well.

Also note about cleaning up delorean repos for tripleo standalone dev
env.

Signed-off-by: Bohdan Dobrelia <[email protected]>
bogdando committed Oct 31, 2023
1 parent aeb0add commit 9f9be89
Showing 28 changed files with 1,076 additions and 185 deletions.
2 changes: 2 additions & 0 deletions docs/README.md
@@ -29,6 +29,8 @@ Perform the actions from the sub-documents in the following order:

* [Placement adoption](openstack/placement_adoption.md)

* [Nova adoption](openstack/nova_adoption.md)

* [Cinder adoption](openstack/cinder_adoption.md)

* [Horizon adoption](openstack/horizon_adoption.md)
251 changes: 203 additions & 48 deletions docs/openstack/edpm_adoption.md
@@ -6,7 +6,11 @@

## Variables

(There are no shell variables necessary currently.)
Define the shell variables used in the Fast-forward upgrade steps below:

```bash
PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
```
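The database query in the fast-forward upgrade steps depends on this value, so it may help to fail early if the secret could not be decoded. The following is a minimal sketch, assuming bash; `require_var` is a hypothetical helper and not part of the adoption tooling:

```bash
# Hypothetical helper (an assumption, not from the original guide):
# fail early when a required shell variable is unset or empty.
require_var() {
    local name=$1
    # ${!name:-} expands the variable whose name is stored in $name,
    # falling back to an empty string when it is unset.
    if [ -z "${!name:-}" ]; then
        echo "ERROR: $name is empty or unset" >&2
        return 1
    fi
}
```

It could be called as `require_var PODIFIED_DB_ROOT_PASSWORD` right after the assignment above.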

## Pre-checks

@@ -95,55 +99,10 @@ EOF
$(cat ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa | base64 | sed 's/^/ /')
EOF
```
* Create the Nova Metadata secret (Workaround while nova isn't adopted yet):
```bash
oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
name: nova-metadata-neutron-config
data:
05-nova-metadata.conf: |
$(echo "[DEFAULT]\nnova_metadata_host = 1.2.3.4\nnova_metadata_port = 8775\nnova_metadata_protocol = http\nmetadata_proxy_shared_secret = 1234567842\n" | base64 | sed 's/^/ /')
EOF
```
* Stop the nova services.
```bash
# Update the services list to be stopped
ServicesToStop=("tripleo_nova_api_cron.service"
"tripleo_nova_api.service"
"tripleo_nova_compute.service"
"tripleo_nova_conductor.service"
"tripleo_nova_libvirt.target"
"tripleo_nova_metadata.service"
"tripleo_nova_migration_target.service"
"tripleo_nova_scheduler.service"
"tripleo_nova_virtlogd_wrapper.service"
"tripleo_nova_virtnodedevd.service"
"tripleo_nova_virtproxyd.service"
"tripleo_nova_virtqemud.service"
"tripleo_nova_virtsecretd.service"
"tripleo_nova_virtstoraged.service"
"tripleo_nova_vnc_proxy.service")
echo "Stopping nova services"
for service in ${ServicesToStop[*]}; do
echo "Stopping the $service in each controller node"
$CONTROLLER1_SSH sudo systemctl stop $service
$CONTROLLER2_SSH sudo systemctl stop $service
$CONTROLLER3_SSH sudo systemctl stop $service
done
```
* Deploy OpenStackDataPlaneNodeSet:
```
```yaml
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
@@ -160,6 +119,8 @@ done
- install-os
- configure-os
- run-os
- libvirt
- nova
- ovn
env:
- name: ANSIBLE_CALLBACKS_ENABLED
@@ -276,7 +237,7 @@ done
* Deploy OpenStackDataPlaneDeployment:
```
```yaml
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
@@ -302,6 +263,200 @@ done
```
* Wait for the dataplane node set to reach the Ready status:
```
oc wait --for condition=Ready osdpns/openstack --timeout=30m
```
## Nova compute services fast-forward upgrade from Wallaby to Antelope
A rolling upgrade of Nova compute services cannot be done in lock-step
with the Nova control plane services during adoption, because the two are
managed independently: by EDPM ansible and by Kubernetes operators.
The Nova service operator and the OpenStack Dataplane operator allow each
side to be upgraded independently of the other by configuring
`[upgrade_levels]compute=auto` for Nova services. Nova control plane
services apply the change right after the CR is patched. Nova compute EDPM
services pick up the same config change later on, with an ansible
deployment.
> **NOTE**: The additional orchestration around the FFU workarounds
> configuration for the Nova compute EDPM service is subject to future changes.
* Configure pre-FFU workarounds for Nova compute EDPM services to update their version records:
```yaml
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-workarounds
  namespace: openstack
data:
  19-nova-compute-cell1-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-workarounds
  namespace: openstack
spec:
  label: nova.compute.workarounds
  configMaps:
    - nova-compute-workarounds
  playbook: osp.edpm.nova
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-workarounds
  namespace: openstack
spec:
  nodeSets:
    - openstack
  servicesOverride:
    - nova-compute-workarounds
EOF
```
* Wait for the version records of the cell1 Nova compute EDPM services to be updated (it may take some time):
```bash
oc exec -it mariadb-openstack-cell1 -- mysql --user=root --password=${PODIFIED_DB_ROOT_PASSWORD} \
-e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
```
The update is complete when the above query returns an empty result.
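Since the version records may take a while to converge, the query can be retried until it prints nothing. A minimal sketch, assuming bash; `wait_until_empty` and its retry and delay arguments are hypothetical and not part of the adoption tooling:

```bash
# Hypothetical helper (an assumption, not from the original guide):
# re-run a command until it prints nothing on stdout, or give up.
# Usage: wait_until_empty <retries> <delay_seconds> <command> [args...]
wait_until_empty() {
    local retries=$1 delay=$2 output
    shift 2
    while [ "$retries" -gt 0 ]; do
        # A failing command counts as "not done yet", not as success.
        if output=$("$@") && [ -z "$output" ]; then
            return 0
        fi
        retries=$((retries - 1))
        [ "$retries" -gt 0 ] && sleep "$delay"
    done
    echo "Timed out; last output: ${output:-<none>}" >&2
    return 1
}
```

For example, `wait_until_empty 30 10 oc exec mariadb-openstack-cell1 -- mysql --user=root --password="${PODIFIED_DB_ROOT_PASSWORD}" -e "<the query above>"` would retry the check for up to about five minutes. The `-it` flags are dropped here on the assumption that the loop runs without a terminal.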
* Remove pre-FFU workarounds for Nova control plane services:
```yaml
oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
spec:
  nova:
    template:
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
        cell1:
          metadataServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
'
```
* Wait for Nova control plane services' CRs to become ready:
```bash
oc get novaapis --field-selector metadata.name=nova-api -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novacells --field-selector metadata.name=nova-cell0 -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novacells --field-selector metadata.name=nova-cell1 -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaconductors --field-selector metadata.name=nova-cell0-conductor -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaconductors --field-selector metadata.name=nova-cell1-conductor -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novametadata --field-selector metadata.name=nova-metadata -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novanovncproxies --field-selector metadata.name=nova-cell1-novncproxy -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaschedulers --field-selector metadata.name=nova-scheduler -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
```
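The eight checks above follow the same pattern, so they can be expressed as a loop over resource kind and name pairs. A minimal sketch, assuming bash with `oc` and `jq` available; `cr_is_ready` and `check_nova_crs` are hypothetical helper names, not part of the adoption tooling:

```bash
# Hypothetical helpers (an assumption, not from the original guide).

# Succeed only when the named CR reports a Ready condition with status True.
cr_is_ready() {
    local kind=$1 name=$2
    oc get "$kind" --field-selector "metadata.name=$name" \
        -o jsonpath='{.items[0].status.conditions}' \
        | jq -e '.[]|select(.type=="Ready" and .status=="True")' >/dev/null
}

# Check every "<kind>/<name>" argument and report the ones that are not ready.
check_nova_crs() {
    local entry failed=0
    for entry in "$@"; do
        if ! cr_is_ready "${entry%%/*}" "${entry#*/}"; then
            echo "Not ready: $entry" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `check_nova_crs novaapis/nova-api novacells/nova-cell0 novacells/nova-cell1 novaconductors/nova-cell0-conductor novaconductors/nova-cell1-conductor novametadata/nova-metadata novanovncproxies/nova-cell1-novncproxy novaschedulers/nova-scheduler` covers the same resources as the commands above and returns non-zero if any of them is not ready.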
* Remove pre-FFU workarounds for Nova compute EDPM services:
```yaml
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-ffu
  namespace: openstack
data:
  20-nova-compute-cell1-ffu-cleanup.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=false
EOF
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-ffu
  namespace: openstack
spec:
  label: nova.compute.ffu
  configMaps:
    - nova-compute-ffu
  playbook: osp.edpm.nova
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu
  namespace: openstack
spec:
  nodeSets:
    - openstack
  servicesOverride:
    - nova-compute-ffu
EOF
```
* Wait for Nova compute EDPM service to become ready:
```bash
oc wait --for condition=Ready osdpd/openstack-nova-compute-ffu --timeout=5m
```
* Run Nova DB online migrations to complete FFU:
```bash
oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
```
* Verify that there were no Nova compute dataplane disruptions during the adoption/upgrade process:
```bash
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 running'
```
* Verify that Nova services control the existing VM instance:
```bash
openstack server list | grep -qF '| test | ACTIVE |' && openstack server stop test
openstack server list | grep -qF '| test | SHUTOFF |'
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 shut off'
openstack server list | grep -qF '| test | SHUTOFF |' && openstack server start test
openstack server list | grep -F '| test | ACTIVE |'
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 running'
```
Note that in this guide the same host acts as both a controller and a compute node.
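`openstack server stop` and `openstack server start` return before the instance changes state, so checking the status immediately afterwards can fail spuriously. A minimal sketch of a polling check, assuming bash and the `openstack` client; `wait_for_server_status` and its default retry values are hypothetical, not part of the adoption tooling:

```bash
# Hypothetical helper (an assumption, not from the original guide):
# poll until the server reaches the wanted status, or give up.
# Usage: wait_for_server_status <server> <status> [retries] [delay_seconds]
wait_for_server_status() {
    local server=$1 wanted=$2 retries=${3:-30} delay=${4:-2} status=""
    while [ "$retries" -gt 0 ]; do
        # Treat a failed lookup as an unknown status and keep polling.
        status=$(openstack server show "$server" -f value -c status) || status=""
        [ "$status" = "$wanted" ] && return 0
        retries=$((retries - 1))
        sleep "$delay"
    done
    echo "Server $server is '$status', expected '$wanted'" >&2
    return 1
}
```

With this in place, a step such as `openstack server stop test` could be followed by `wait_for_server_status test SHUTOFF` before running the `virsh list` check.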
5 changes: 3 additions & 2 deletions docs/openstack/keystone_adoption.md
@@ -1,7 +1,8 @@
## Prerequisites

* Previous Adoption steps completed. Notably, the service databases
must already be imported into the podified MariaDB.
* Previous Adoption steps completed. Notably,
* the [service databases](mariadb_copy.md)
must already be imported into the podified MariaDB.

## Variables
