title: Archipelago-deployment-live: Upgrading Solr
tags: Archipelago-deployment-live, Drupal 10, Solr 8, Solr 9

Archipelago-deployment-live: Upgrading Solr

What is this documentation for?

This documentation will help you upgrade your Solr index from 8.x to 9.x, or between 9.x releases, and is meant to be a guide/helper. There is no simple way of saying this: because of the way a Solr index (sort of a binary tree) is built, any larger change in the schema or field type definitions requires a complete reindex and, most of the time, a wipe-and-start-fresh situation. There is no perfect way around it. There are very complex ways of keeping an old server running and serving searches while you reindex a new one, but honestly, the how and the approach will depend on your existing knowledge of Solr and your skills (even memory!) to execute them, and documenting those hacks is beyond the scope of this documentation. What is proven is what we explain in this document.

Requirements

  • An archipelago-deployment-live instance (working, tested) deployed using provided instructions via Docker running either Solr 8.x or 9.x
  • Good knowledge of how to run terminal commands, plus patience and instincts (and courage and time).
  • Patience (again, but also patience from your users, since search will be unavailable until you reindex). You can't skip steps here.
  • For the shell commands documented here, please copy line by line, not the whole block.
  • You are already using version control and know how to git pull/push/merge.

Backing up and preparing for the upgrade

Backups are always going to be your best friends. Archipelago's code, database, and settings are mostly self-contained in your current archipelago-deployment-live repo folder, and backing up is simple because of that.

Step 1:

To make upgrading simpler we will clone archipelago-deployment-live (empty one) into a different folder. That way we can copy complete folders of configs and files instead of fetching them from github one by one.

Go to your home folder (for the sake of this documentation it will be /home/ec2-user, but you can also use the $HOME environment variable instead).

cd /home/ec2-user
git clone https://github.com/esmero/archipelago-deployment-live archipelago-deployment-live-1.4.0
cd archipelago-deployment-live-1.4.0
git switch 1.4.0

Now, on a terminal, cd into your running archipelago-deployment-live folder, then cd into the deploy/ec2-docker subfolder and shut down your docker-compose ensemble by running the following:

docker-compose down

Step 2:

Verify that all containers are actually down. The following command should return an empty listing:

docker ps

If anything is still running, wait a little longer and run the command again.
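If you would rather not re-run docker ps by hand, the waiting can be scripted. This is only a minimal sketch and not part of the tested procedure: wait_for_containers_down is a hypothetical helper name, and the defaults (30 polls, 2 seconds apart) are arbitrary assumptions. It relies on docker ps -q, which prints one ID per running container and nothing at all when none are running.

```shell
# Hypothetical helper (not part of Archipelago): poll until "docker ps -q"
# prints nothing, or give up after a number of tries.
wait_for_containers_down() {
  tries="${1:-30}"   # maximum number of polls (assumed default)
  delay="${2:-2}"    # seconds to sleep between polls (assumed default)
  while [ "$tries" -gt 0 ]; do
    # docker ps -q prints one container ID per running container
    if [ -z "$(docker ps -q)" ]; then
      echo "all containers are down"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "containers still running" >&2
  return 1
}
```

Call it as wait_for_containers_down (or wait_for_containers_down 60 5 for a longer wait). Keep in mind that docker ps lists every running container on the host, so on a server that also runs non-Archipelago containers this check will never come back empty.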

Step 3:

Now let's tar.gz the whole ensemble with data and configs. We will exclude the local source caches generated by Cantaloupe. Whether these exist or not will depend on how customized your deployment is.

As an example we will save this into your $HOME folder. As a good practice we append the current date (YEAR-MONTH-DAY) to the filename. Here we assume today is July 7th of 2024.

We will cd back to the parent folder of your running archipelago-deployment-live folder, so three levels up, assuming you are right now inside archipelago-deployment-live/deploy/ec2-docker:

cd ../../..
sudo tar --exclude=archipelago-deployment-live/data_storage/iiifcache --exclude=archipelago-deployment-live/data_storage/iiiftmp -czvpf $HOME/archipelago-deployment-D10-20240707.tar.gz archipelago-deployment-live
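The date suffix in the file name above does not have to be typed by hand. As a small sketch (backup_archive_path is a hypothetical helper name, and the YYYYMMDD layout simply mirrors the 20240707 example), the date command can generate it:

```shell
# Hypothetical helper: print the backup path used in this guide,
# with today's date (YYYYMMDD) filled in automatically.
backup_archive_path() {
  echo "$HOME/archipelago-deployment-D10-$(date +%Y%m%d).tar.gz"
}
```

You could then pass "$(backup_archive_path)" to tar in place of the hard-coded file name. The same helper gives you the matching name again when you verify the archive, as long as you do both on the same day.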

The process may take a few minutes. Now let's verify that all is there and that the tar.gz is not corrupt.

tar -tvvf $HOME/archipelago-deployment-D10-20240707.tar.gz

You will see a listing of files, and at the end you will see something like this: Archive Format: POSIX pax interchange format, Compression: gzip. If corrupt (Do you have enough space? Did your ssh connection drop?) you will see the following:

tar: Unrecognized archive format
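The exact wording of these messages differs between tar implementations, so if you script the check it is safer to rely on exit codes. A minimal sketch, assuming gzip and tar are on your PATH (verify_backup is a hypothetical helper name): gzip -t tests the compressed stream, and tar -tzf walks the whole archive listing.

```shell
# Hypothetical helper: return 0 only if the archive exists, is a valid
# gzip stream, and tar can list it all the way to the end.
verify_backup() {
  archive="$1"
  if [ ! -s "$archive" ]; then
    echo "missing or empty: $archive" >&2
    return 1
  fi
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "gzip check failed: $archive" >&2
    return 1
  fi
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "tar listing failed: $archive" >&2
    return 1
  fi
  echo "archive OK: $archive"
}
```

For example: verify_backup $HOME/archipelago-deployment-D10-20240707.tar.gz. A passing check only tells you the archive is readable, not that it contains everything you wanted, so a quick look at the listing is still worth it.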

Step 4:

cd again into your running archipelago-deployment-live folder, then cd into deploy/ec2-docker. Restart your docker-compose ensemble, and wait a little while for everything to start.

docker-compose up -d

Step 5:

Export/backup all of your live Archipelago 1.3.0, Drupal 10 configurations (this allows you to compare/come back in case you lose something custom during the upgrade).

docker exec esmero-php mkdir config/backup
docker exec esmero-php drush cex --destination=/var/www/html/config/backup

Good. Now it's safe to begin the upgrade process.


Upgrading to Solr 9.2.1

Step 0: Get familiar with what changed.

Running a production server requires some informed decision making and thus, we believe, a good pre-step is reviewing what changed between releases. Specifically, focus on these folders:

https://github.com/esmero/archipelago-deployment-live/tree/1.4.0/config_storage/solrconfig/conf

and

https://github.com/esmero/archipelago-deployment-live/tree/1.4.0/data_storage/solrlib

Also (please) read the official documentation here https://solr.apache.org/guide/8_9/solr-upgrade-notes.html

Step 1: Edit docker-compose.yml

You want to replace your current Solr service in its entirety with the following. Please make sure the indentation matches 1:1.

  solr:
    container_name: esmero-solr
    restart: always
    image: "solr:9.2.1"
    # If running Docker < 20.10.10 please uncomment the following lines
    # See https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#solr-9-2 
    #security_opt:
    #  - seccomp:unconfined
    tty: true
    environment:
      SOLR_HEAP: 1024m
      SOLR_OPTS: -Dsolr.jetty.request.header.size=65535 -Dsolr.modules=scripting
    ports:
      - "8983:8983"
    networks:
      - host-net
      - esmero-net
    volumes:
      - ${ARCHIPELAGO_ROOT}/data_storage/solrcore:/var/solr/data
      - ${ARCHIPELAGO_ROOT}/config_storage/solrconfig:/drupalconfig
      - ${ARCHIPELAGO_ROOT}/data_storage/solrlib:/opt/solr/contrib/archipelago/lib
    entrypoint:
      - docker-entrypoint.sh
      - solr-precreate
      - drupal
      - /drupalconfig

You can also use any of these as reference:

Before editing, if you have not already, run:

docker-compose down

Then open your docker-compose.yml file, find the solr: key and replace it with these new settings. If for some reason your volumes do not match our defaults, please adapt them to your custom edits so they match where the Solr core and libraries are saved:

nano docker-compose.yml

Save your changes.
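Before moving on, it does not hurt to confirm that the file really points at the new image. A minimal sketch, assuming the image line looks like the one shown above (solr_image_tag is a hypothetical helper name; it skips commented-out lines and prints the tag of the first solr image it finds):

```shell
# Hypothetical helper: print the tag of the first uncommented
# 'image: "solr:<tag>"' line found in a compose file.
solr_image_tag() {
  compose_file="${1:-docker-compose.yml}"
  sed -n 's/^[[:space:]]*image:[[:space:]]*"\{0,1\}solr:\([^"[:space:]]*\)"\{0,1\}.*$/\1/p' \
    "$compose_file" | head -n 1
}
```

Running solr_image_tag docker-compose.yml from the deploy/ec2-docker folder should print 9.2.1 after the edit. To catch indentation slips as well, docker-compose config -q parses the whole file and stays silent when it is valid.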

Step 2: Wipe clean. Get the new configs. Get the new OCR Highlight library

Wait! (Breathe.)

This step is only required if you are moving from Solr 8.x to 9.x, or if you are staying within 9.x but have Solr field type definition changes. If you had a stock 1.3.0 with Solr 9.1 and want to move to Solr 9.2, you can skip deleting everything and jump to Step 3!

This step requires some nerve. Be sure you know where you are inside your terminal (always).

Inside your archipelago-deployment-live folder run:

cd data_storage/solrcore
pwd

You should see something like

/home/ec2-user/archipelago-deployment-live/data_storage/solrcore

Which means you are in the correct folder. Now it is time to clean your index (really think twice here, OK? You have a backup. Never run any of these without a backup):

sudo rm -rf *
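If typing rm -rf * makes you nervous (it should), the same wipe can be wrapped in a guard that refuses to run anywhere else. This is a minimal sketch under assumptions: wipe_solr_core is a hypothetical helper name, it only accepts a path ending in data_storage/solrcore, and it calls rm without sudo, so depending on file ownership you may still need elevated permissions as in the command above.

```shell
# Hypothetical helper: empty a Solr core folder, but only if the path
# really ends in data_storage/solrcore.
wipe_solr_core() {
  target="$1"
  case "$target" in
    */data_storage/solrcore) ;;   # looks like the right folder, continue
    *)
      echo "refusing to wipe: $target" >&2
      return 1
      ;;
  esac
  if [ ! -d "$target" ]; then
    echo "not a directory: $target" >&2
    return 1
  fi
  # ${target:?} aborts if the variable is empty, so this never expands to /*
  rm -rf "${target:?}"/*
  echo "wiped contents of: $target"
}
```

For example: wipe_solr_core /home/ec2-user/archipelago-deployment-live/data_storage/solrcore. Like the plain rm -rf *, this removes the visible contents and leaves the folder itself in place.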

Now we need the new configurations for your Solr (so that the Docker container can re-create the index from scratch). Remember we downloaded a reference/empty Archipelago Deployment Live 1.4.0 at /home/ec2-user/archipelago-deployment-live-1.4.0. We are going to use the files there to replace your own configs. cd back to your live deployment, assuming here it is (still) /home/ec2-user/archipelago-deployment-live:

cd /home/ec2-user/archipelago-deployment-live
cp -rpv /home/ec2-user/archipelago-deployment-live-1.4.0/config_storage/solrconfig/conf/* /home/ec2-user/archipelago-deployment-live/config_storage/solrconfig/conf/.

Now we need to remove the old OCR library and replace it with the new one:

rm /home/ec2-user/archipelago-deployment-live/data_storage/solrlib/solr-ocrhighlighting-0.7.1.jar
cp -rpv /home/ec2-user/archipelago-deployment-live-1.4.0/data_storage/solrlib/solr-ocrhighlighting-0.8.4-SNAPSHOT.jar /home/ec2-user/archipelago-deployment-live/data_storage/solrlib/.
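Having two versions of the OCR highlighting library side by side in solrlib is an easy mistake to make here. A minimal sketch of a sanity check (check_ocr_jar is a hypothetical helper name, and the rule that exactly one matching jar should be present is our assumption based on the remove-then-copy step above):

```shell
# Hypothetical helper: list solr-ocrhighlighting jars in a folder and
# fail unless exactly one is present.
check_ocr_jar() {
  lib_dir="$1"
  count=0
  for jar in "$lib_dir"/solr-ocrhighlighting-*.jar; do
    [ -e "$jar" ] || continue     # the glob matched nothing
    count=$((count + 1))
    echo "found: $(basename "$jar")"
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly 1 OCR highlighting jar, found $count" >&2
    return 1
  fi
}
```

For example: check_ocr_jar /home/ec2-user/archipelago-deployment-live/data_storage/solrlib.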

Step 3: docker pull and check

Optional: You might want to review/compare (now) your current Drupal Search API Index against our most current one here:

Specifically:

and anything that starts with search_api. too.
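If you want to do that comparison from the terminal, here is a minimal sketch. Everything about the paths is an assumption: compare_search_api_configs is a hypothetical helper name, and the two folders are placeholders for wherever your exported configuration (for example the config/backup export made during the backup steps) and the reference configuration live on your host. It only reports which search_api.*.yml files are missing or different; it changes nothing.

```shell
# Hypothetical helper: compare search_api.*.yml files between your own
# exported config folder and a reference config folder.
compare_search_api_configs() {
  current_dir="$1"     # your exported configuration (assumed location)
  reference_dir="$2"   # the reference configuration (assumed location)
  for ref in "$reference_dir"/search_api.*.yml; do
    [ -e "$ref" ] || continue     # the glob matched nothing
    name=$(basename "$ref")
    if [ ! -e "$current_dir/$name" ]; then
      echo "missing: $name"
    elif ! cmp -s "$current_dir/$name" "$ref"; then
      echo "differs: $name"
    fi
  done
}
```

For any file reported as differs, a plain diff -u between the two copies shows the actual changes so you can decide what to keep.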

Time to fetch the latest Solr that will re-create the index:

docker compose pull
docker compose up -d

Give all a little time to start. Please be patient. To ensure all is well, run (more than once if necessary) the following:

docker ps

You should see something like this if you synced all containers to the latest (your versions and database might vary depending on your server's platform):

CONTAINER ID   IMAGE                                      COMMAND                  CREATED          STATUS          PORTS                              NAMES
5b06ee366f58   jonasal/nginx-certbot                      "/docker-entrypoint.…"   10 minutes ago   Up 10 minutes   0.0.0.0:8001->80/tcp               esmero-web
86b685008158   solr:9.2.1                                 "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes   0.0.0.0:8983->8983/tcp             esmero-solr
a4872b237e17   esmero/cantaloupe-s3:6.0.1-multiarch       "sh -c 'java -Dcanta…"   10 minutes ago   Up 10 minutes   0.0.0.0:8183->8182/tcp             esmero-cantaloupe
bec0b31f3421   mariadb:10.6.18-focal                      "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes   3306/tcp                           esmero-db
85bedadf9732   redis:6.2-alpine                           "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes                                      esmero-redis
6a9e9d8647a9   minio/minio:RELEASE.2022-06-11T19-55-32Z   "/usr/bin/docker-ent…"   10 minutes ago   Up 10 minutes   0.0.0.0:9000-9001->9000-9001/tcp   esmero-minio
bc5327680ca7   esmero/php-8.1-fpm:1.2.0-multiarch         "docker-php-entrypoi…"   10 minutes ago   Up 10 minutes   9000/tcp                           esmero-php
d53729be1211   esmero/esmero-nlp:fasttext-multiarch       "/usr/local/bin/entr…"   10 minutes ago   Up 10 minutes   0.0.0.0:6400->6400/tcp             esmero-nlp

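Important here is the STATUS column. Every container should show Up followed by a duration, and that duration should keep growing each time you run docker ps again (and again). A duration that keeps resetting means the container is restarting.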
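Reading the STATUS column by eye works fine for eight containers, but it can also be checked mechanically. A minimal sketch (report_not_up is a hypothetical helper name, and the name|status input layout is our own choice, produced with the --format option of docker ps): it prints every container whose status does not start with Up and returns a non-zero exit code if it finds any.

```shell
# Hypothetical helper: read "name|status" lines on stdin and report
# every container whose status does not start with "Up ".
report_not_up() {
  awk -F '|' '$2 !~ /^Up / { print $1 ": " $2; bad = 1 } END { exit bad }'
}
```

Feed it like this: docker ps -a --format '{{.Names}}|{{.Status}}' | report_not_up. The -a flag matters, because a container that has already exited would not show up in a plain docker ps at all.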

Check your Solr logs for failure messages (press Ctrl+C to stop following them):

docker logs -f -n 100 esmero-solr
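Solr is chatty while it starts, so the failures are easy to miss. A minimal sketch of a filter (solr_log_problems is a hypothetical helper name, and the patterns are an assumption about what a failure usually looks like in the log: lines logged at ERROR level and Java exception traces):

```shell
# Hypothetical helper: keep only log lines that look like failures.
# The patterns are a guess at the usual shape of Solr errors.
solr_log_problems() {
  # grep exits non-zero when nothing matches; treat that as "no problems"
  grep -E ' ERROR |Exception|Caused by:' || true
}
```

Use it without -f so the command ends on its own: docker logs -n 200 esmero-solr 2>&1 | solr_log_problems. An empty result is a good sign, but it is not proof, so still try some searches once the index is rebuilt.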

Step 4: Re-index Drupal Search API

Whether or not you synchronize Drupal's Search API server, index and fields is a personal decision (optional). We do recommend it, but your Solr index might already have many customizations, so a better/pro approach would be to diff the .yml files and decide selectively. Once you have decided, done that, or skipped it, it is time to reindex.

Run the following:

docker exec esmero-php drush search-api-reindex
docker exec esmero-php drush search-api-index
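On a large repository the indexing run can take a long time, and it helps to see how far along it is. A minimal sketch that chains the two commands above with drush search-api-status, which prints the indexed and total item counts per index (reindex_search_api is a hypothetical helper name; the container name esmero-php is the one used throughout this guide):

```shell
# Hypothetical helper: mark everything for reindexing, index it, then
# print the Search API status table. Stops at the first failing step.
reindex_search_api() {
  docker exec esmero-php drush search-api-reindex &&
  docker exec esmero-php drush search-api-index &&
  docker exec esmero-php drush search-api-status
}
```

If the indexing step gets interrupted (for example by a dropped ssh connection), running docker exec esmero-php drush search-api-index again picks up the items that are still pending.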

Check your Drupal logs, try some searches.

If you made it this far you are done with code/devops (are we ever ready?), and that means you should be able to (hopefully) stay in the Drupal 10.x realm for a few years!

Done!


Need help? Blue Screen? Missed a step? Need a hug or someone that listens to you in silence?

If you see any issues or errors or need help with a step, please let us know (ASAP!). You can either open an issue in this repository or use the Google Group. We are here to help.

Caring & Coding + Fixing + Testing

License

GPLv3