Nova services FFU during adoption (no extra cell)
Update EDPM adoption docs and tests to execute Nova compute post-FFU.

Verify no dataplane disruptions during the FFU/adoption process.

Verify that Nova services still control the pre-created VM workload after
FFU/adoption is done.

Signed-off-by: Bohdan Dobrelia <[email protected]>
bogdando committed Nov 6, 2023
1 parent f2a7a19 commit d591e29
Showing 4 changed files with 415 additions and 2 deletions.
203 changes: 202 additions & 1 deletion docs/openstack/edpm_adoption.md
@@ -6,7 +6,15 @@

## Variables

(There are no shell variables necessary currently.)
Define the shell variables used in the Fast-forward upgrade steps below.
The values are illustrative; use values that are correct for your environment:

```bash
PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
CONTROLLER_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa [email protected]"

alias openstack="oc exec -t openstackclient -- openstack"
```
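
The later steps can fail in confusing ways if one of these variables is empty, for example when the secret lookup returns nothing. A small guard can be run first. This is a minimal sketch; `require_vars` is a hypothetical helper name and is not part of the adoption tooling:

```bash
# Hypothetical helper: fail early when any named variable is unset or empty.
# Usage: require_vars VAR_NAME [VAR_NAME...]
require_vars() {
    local name value missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name. The names are assumed
        # to be trusted identifiers typed by the operator, never user input.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "ERROR: variable $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For instance, `require_vars PODIFIED_DB_ROOT_PASSWORD CONTROLLER_SSH || exit 1` stops a script before any cluster command is issued.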

## Pre-checks

@@ -263,3 +271,196 @@ EOF
```
oc wait --for condition=Ready osdpns/openstack --timeout=30m
```
## Nova compute services fast-forward upgrade from Wallaby to Antelope
Nova compute services cannot be upgraded during adoption as a rolling
upgrade in lock-step with the Nova control plane services, because the two
are managed independently: the compute services by EDPM Ansible, and the
control plane services by Kubernetes operators.
The Nova service operator and the OpenStack Dataplane operator therefore
let each side be upgraded independently of the other by configuring
`[upgrade_levels]compute=auto` for Nova services. Nova control plane
services apply the change right after the CR is patched. Nova compute EDPM
services pick up the same configuration change later, with the Ansible
deployment.
> **NOTE**: The additional orchestration around configuring the FFU workarounds
> for the Nova compute EDPM service is subject to future changes.
* Configure pre-FFU workarounds for Nova compute EDPM services to update their version records:
```yaml
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-workarounds
  namespace: openstack
data:
  19-nova-compute-cell1-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-workarounds
  namespace: openstack
spec:
  label: nova.compute.workarounds
  configMaps:
    - nova-compute-workarounds
  playbook: osp.edpm.nova
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-workarounds
  namespace: openstack
spec:
  nodeSets:
    - openstack
  servicesOverride:
    - nova-compute-workarounds
EOF
```
* Wait for the cell1 Nova compute EDPM services to update their version records (this may take some time):
```bash
oc exec -it mariadb-openstack-cell1 -- mysql --user=root --password="${PODIFIED_DB_ROOT_PASSWORD}" \
    -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
```
The above query should return an empty result as a completion criterion.
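
Because the version records converge asynchronously, the query usually has to be repeated until it comes back empty. A minimal polling sketch follows; `wait_until_empty` is a hypothetical helper, and the retry count and delay passed to it are illustrative assumptions rather than recommended values:

```bash
# Hypothetical helper: re-run a command until it prints nothing, or give up.
# Usage: wait_until_empty RETRIES DELAY_SECONDS COMMAND [ARGS...]
wait_until_empty() {
    local retries=$1 delay=$2 attempt=1 output
    shift 2
    while [ "$attempt" -le "$retries" ]; do
        # A failing command counts as "not done yet", never as success.
        if output=$("$@") && [ -z "$output" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "Timed out waiting for empty output from: $*" >&2
    return 1
}
```

It could wrap the `oc exec ... mysql` query from this step, ideally without `-it` so that no terminal formatting is mixed into the captured output. This assumes the `mysql` client prints nothing when the query matches no rows.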
* Remove pre-FFU workarounds for Nova control plane services:
```yaml
oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
spec:
  nova:
    template:
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
        cell1:
          metadataServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
'
```
* Wait for Nova control plane services' CRs to become ready:
```bash
oc get novaapis --field-selector metadata.name=nova-api -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novacells --field-selector metadata.name=nova-cell0 -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novacells --field-selector metadata.name=nova-cell1 -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaconductors --field-selector metadata.name=nova-cell0-conductor -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaconductors --field-selector metadata.name=nova-cell1-conductor -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novametadata --field-selector metadata.name=nova-metadata -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novanovncproxies --field-selector metadata.name=nova-cell1-novncproxy -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
oc get novaschedulers --field-selector metadata.name=nova-scheduler -o jsonpath='{.items[0].status.conditions}' \
| jq -e '.[]|select(.type=="Ready" and .status=="True")'
```
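
The eight checks above share one shape, so they can be folded into a loop. The sketch below swaps the `jq` filter for `oc wait --for condition=Ready`, which this guide already uses for the dataplane resources. `wait_all_ready` is a hypothetical helper, and the timeout passed to it is an illustrative assumption:

```bash
# Hypothetical helper: wait for each "resource/name" argument to report Ready.
# Usage: wait_all_ready TIMEOUT RESOURCE/NAME [RESOURCE/NAME...]
wait_all_ready() {
    local timeout=$1 target failed=0
    shift
    for target in "$@"; do
        # Keep going after a failure so that every unready CR is reported.
        if ! oc wait --for condition=Ready "$target" --timeout="$timeout"; then
            echo "Not ready: $target" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

With the resource names from the commands above, a call could look like `wait_all_ready 5m novaapis/nova-api novacells/nova-cell0 novacells/nova-cell1 novaconductors/nova-cell0-conductor novaconductors/nova-cell1-conductor novametadata/nova-metadata novanovncproxies/nova-cell1-novncproxy novaschedulers/nova-scheduler`.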
* Remove pre-FFU workarounds for Nova compute EDPM services:
```yaml
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-ffu
  namespace: openstack
data:
  20-nova-compute-cell1-ffu-cleanup.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=false
EOF
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-ffu
  namespace: openstack
spec:
  label: nova.compute.ffu
  configMaps:
    - nova-compute-ffu
  playbook: osp.edpm.nova
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu
  namespace: openstack
spec:
  nodeSets:
    - openstack
  servicesOverride:
    - nova-compute-ffu
EOF
```
* Wait for Nova compute EDPM service to become ready:
```bash
oc wait --for condition=Ready osdpd/openstack-nova-compute-ffu --timeout=5m
```
* Run Nova DB online migrations to complete FFU:
```bash
oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
```
* Verify no Nova compute dataplane disruptions during the adoption/upgrade process:
```bash
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 running'
```
* Verify that Nova services control the existing VM instance:
```bash
openstack server list | grep -qF '| test | ACTIVE |' && openstack server stop test
openstack server list | grep -qF '| test | SHUTOFF |'
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 shut off'
openstack server list | grep -qF '| test | SHUTOFF |' && openstack server start test
openstack server list | grep -F '| test | ACTIVE |'
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 running'
```
Note that in this guide, the same host acts as a controller, and also as a compute.
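
`openstack server stop` and `openstack server start` may return before the instance has changed state, so checking the status immediately afterwards can race. A minimal polling sketch follows. It assumes the `openstack` command is reachable from the script (bash does not expand aliases in non-interactive shells unless `expand_aliases` is set); `wait_for_server_status` is a hypothetical helper and its default retry values are illustrative:

```bash
# Hypothetical helper: poll until server NAME reports STATUS, or give up.
# Usage: wait_for_server_status NAME STATUS [RETRIES] [DELAY_SECONDS]
wait_for_server_status() {
    local name=$1 want=$2 retries=${3:-30} delay=${4:-2} attempt=1 status
    while [ "$attempt" -le "$retries" ]; do
        # "-f value -c status" prints only the status, e.g. ACTIVE or SHUTOFF.
        # tr strips carriage returns that a TTY-backed "oc exec -t" may add.
        status=$(openstack server show "$name" -f value -c status 2>/dev/null | tr -d '\r')
        if [ "$status" = "$want" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "Server $name did not reach $want (last status: ${status:-unknown})" >&2
    return 1
}
```

For example, `openstack server stop test && wait_for_server_status test SHUTOFF` could precede the `virsh list` check, and the same pattern applies after `openstack server start test` with `ACTIVE`.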
2 changes: 1 addition & 1 deletion docs/openstack/nova_adoption.md
@@ -133,7 +133,7 @@ Compare the following outputs with the topology specific configuration
* Default cell is renamed to `cell1` (in a multi-cell setup, it should become indexed as the last cell instead).
* RabbitMQ transport URL no longer uses `guest`.

* Verify no Nova compute dataplane disruptions during the adoption process:
* Verify no Nova compute dataplane disruptions during the adoption/upgrade process:

```bash
$CONTROLLER_SSH sudo podman exec -it libvirt_virtqemud virsh list --all | grep 'instance-00000001 running'
4 changes: 4 additions & 0 deletions tests/roles/dataplane_adoption/tasks/main.yaml
@@ -265,3 +265,7 @@
oc wait --for condition=Ready osdpns/openstack --timeout=40m
# TODO: work on network configuration for making possible to run this task on other IP ranges
when: "edpm_node_ip.startswith('192.168.122')"

- name: Complete Nova services Wallaby->Antelope FFU
  ansible.builtin.include_tasks:
    file: nova_ffu.yaml