fix(docs): clarify clean-up of indices when restoring search and graph indices
Masterchen09 committed Sep 16, 2024
1 parent dc35465 commit 44733fe
Showing 1 changed file with 43 additions and 8 deletions.
51 changes: 43 additions & 8 deletions docs/how/restore-indices.md
@@ -7,24 +7,39 @@ When a new version of the aspect gets ingested, GMS initiates an MAE event for t
the search and graph indices. As such, we can fetch the latest version of each aspect in the local database and produce
MAE events corresponding to the aspects to restore the search and graph indices.

By default, restoring the indices from the local database will not remove any existing documents in
the search and graph indices that no longer exist in the local database, potentially leading to inconsistencies
between the search and graph indices and the local database.

## Quickstart

If you're using the quickstart images, you can use the `datahub` cli to restore indices.
If you're using the quickstart images, you can use the `datahub` CLI to restore the indices.

```
```shell
datahub docker quickstart --restore-indices
```
See [this section](../quickstart.md#restoring-only-the-index-use-with-care) for more information.

:::info
Using the `datahub` CLI to restore the indices when using the quickstart images will also clear the search and graph indices before restoring.

See [this section](../quickstart.md#restore-datahub) for more information.

## Docker-compose

If you are on a custom docker-compose deployment, run the following command (you need to checkout [the source repository](https://github.com/datahub-project/datahub)) from the root of the repo to send MAE for each aspect in the Local DB.
If you are on a custom docker-compose deployment, run the following command (you need to check out [the source repository](https://github.com/datahub-project/datahub)) from the root of the repo to send an MAE event for each aspect in the local database.

```
```shell
./docker/datahub-upgrade/datahub-upgrade.sh -u RestoreIndices
```

If you need to clear the search and graph indices before restoring, add `-a clean` to the end of the command.
:::info
By default, this command does not clear the search and graph indices before restoring. This can lead to inconsistencies between the local database and the indices if aspects were previously deleted from the local database but were not removed from the corresponding index.

If you need to clear the search and graph indices before restoring, add `-a clean` to the end of the command. Note that the search and graph services might not be fully functional during reindexing when the indices are cleared.

```shell
./docker/datahub-upgrade/datahub-upgrade.sh -u RestoreIndices -a clean
```

Refer to this [doc](../../docker/datahub-upgrade/README.md#environment-variables) on how to set environment variables
for your environment.
@@ -44,11 +59,31 @@ If not, deploy latest helm charts to use this functionality.

Once the restore indices job template has been deployed, run the following command to start a job that restores the indices.

```
```shell
kubectl create job --from=cronjob/datahub-datahub-restore-indices-job-template datahub-restore-indices-adhoc
```

Once the job completes, your indices will have been restored.
Once the job completes, your indices will have been restored.
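
To check that the job has finished, you can wait on it and then read its logs. The following is a minimal sketch and not part of the DataHub tooling: the helper name `wait_for_restore_job`, the `KUBECTL` override, and the 30 minute default timeout are assumptions for illustration. It relies only on the standard `kubectl wait` and `kubectl logs` subcommands and on the job name used above.

```shell
# Hypothetical helper (not shipped with DataHub): wait for the ad hoc
# restore job created above, then print its logs.
# KUBECTL can be overridden, e.g. to point at a wrapper script.
KUBECTL="${KUBECTL:-kubectl}"
JOB_NAME="datahub-restore-indices-adhoc"

wait_for_restore_job() {
  # Block until the job reports the Complete condition.
  # The default timeout of 30m is an assumption; pass another value as $1.
  "$KUBECTL" wait --for=condition=complete "job/${JOB_NAME}" --timeout="${1:-30m}" || return 1
  # Print the job output so the result of the restore can be inspected.
  "$KUBECTL" logs "job/${JOB_NAME}"
}
```

Call it as `wait_for_restore_job` or, for a longer restore, `wait_for_restore_job 2h`. If the job fails instead of completing, `kubectl wait` only returns once the timeout expires, so pick a timeout that matches the size of your database.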

:::info
By default, the restore indices job template does not clear the search and graph indices before restoring. This can lead to inconsistencies between the local database and the indices if aspects were previously deleted from the local database but were not removed from the corresponding index.

If you need to clear the search and graph indices before restoring, modify the `values.yaml` for your deployment and override the default arguments of the restore indices job template to include the `-a clean` argument. Note that the search and graph services might not be fully functional during reindexing when the indices are cleared.

```yaml
datahubUpgrade:
  restoreIndices:
    image:
      args:
        - "-u"
        - "RestoreIndices"
        - "-a"
        - "batchSize=1000" # default value of datahubUpgrade.batchSize
        - "-a"
        - "batchDelayMs=100" # default value of datahubUpgrade.batchDelayMs
        - "-a"
        - "clean"
```
## Through API