Upgrade to v4.12

This guide lists the breaking changes in v4.12 and shows how to roll back to v4.10 if needed.

Upgrade

Upgrading is as simple as following the steps in the Release Notes:

Upgrade on GKE

kubectl apply -f "https://www.hydrolix.io/operator/v4.12.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Upgrade on EKS

kubectl apply -f "https://www.hydrolix.io/operator/v4.12.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

Breaking Changes and Compatibility Notices

Unified Authentication is Default

If you're upgrading from version 4.8 or earlier, note that this version of Hydrolix uses Unified Authentication by default. This is a breaking change for existing clusters that use Basic Authentication. To keep the current behavior when upgrading from 4.8 or earlier, set unified_auth: false in your cluster's configuration.

Invitation URL API Change

As of version 4.10, the /config/v1/inviteurl API endpoint has moved to /config/v1/invites/invite_url. Refer to the API documentation for this and other invitation system API changes.

Continuous Control Loop Operator

As of version 4.10, the operator runs continuously rather than once per deployment. Temporary manual changes to deployments will be overwritten by this control loop. Before making manual changes, scale the operator to 0 with kubectl scale --replicas 0 deployment/operator, and scale it back to 1 when you're done.

Rolling Back to 4.8 with Batch Ingest

If you are using batch ingest, and need to roll back to 4.8.x after upgrading to this version, contact Hydrolix Customer Success before rolling back.

Recent Versions of PostgreSQL

This version of Hydrolix requires PostgreSQL version 11 or higher. We strongly recommend PostgreSQL 13 or higher to simplify future upgrades. If you are using PostgreSQL 11 or 12, enable the ltree extension as a superuser, and consider upgrading to a current PostgreSQL release.
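For example, connected as a superuser with psql, one statement enables the extension. The hostname and database name below are placeholders for your deployment, not values from this guide:

```shell
# Enable the ltree extension as a PostgreSQL superuser (needed on PG 11/12).
# <host> and <database> are placeholders; substitute your deployment's values.
psql -h <host> -U postgres -d <database> -c "CREATE EXTENSION IF NOT EXISTS ltree;"
```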

v4.12 to v4.10 Downgrade Procedure

Overview

Due to upgrades to PostgreSQL libraries in Hydrolix version v4.12, a downgrade from v4.12 to v4.10 requires the extra task of restoring an older version of the Keycloak database. This v4.10-compatible database backup was created right before your cluster was upgraded to v4.12.

Steps to follow:

  1. roll back the cluster to v4.10
  2. locate and download the backup
  3. modify scale of operator and tooling pods
  4. decrypt the backup
  5. stop Keycloak
  6. drop and recreate the Keycloak database
  7. restore the backup
  8. scale back to normal operation

Roll Back the Cluster to v4.10

This is the same as the upgrade steps normally published in the release notes, but with v4.10 as the target version. For example:

Roll Back on GKE:

kubectl apply -f "https://www.hydrolix.io/operator/v4.10.9/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Roll Back on EKS:

kubectl apply -f "https://www.hydrolix.io/operator/v4.10.9/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

Locate and Download the Backup File

The v4.10 database backup is stored in the backups directory at the top level of your cloud storage bucket.

For example, in Google Cloud Storage:

gcloud storage ls -l gs://<bucket_name>/backups

In AWS S3:

aws s3 ls --profile <profile_name> <bucket_name>/backups/

Find the file whose timestamp corresponds to the time at which you upgraded the cluster to v4.12, then download it to your local workstation. Continuing the Google Cloud Storage example (your filename will differ):

gcloud storage cp gs://<bucket_name>/backups/keycloak_v4.12.3_2024-05-14_23-41-23.sql.enc .

And AWS (again, with a different filename):

aws s3 cp --profile <profile_name> s3://<bucket_name>/backups/keycloak_v4.12.0-44-g894aca1c_2024-05-13_15-25-45.sql.enc .

📘 The Version Number Can Be Misleading

Even though the version number in the filename above says v4.12.3, it is a backup of the v4.10 database, created during the upgrade to v4.12.3.
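Because the filename encodes the upgrade timestamp rather than the database version, match on the date and time. A minimal shell sketch, assuming the filename layout keycloak_&lt;version&gt;_&lt;YYYY-MM-DD&gt;_&lt;HH-MM-SS&gt;.sql.enc shown above:

```shell
# Extract the upgrade timestamp embedded in a backup filename.
fname="keycloak_v4.12.3_2024-05-14_23-41-23.sql.enc"

base="${fname%.sql.enc}"      # keycloak_v4.12.3_2024-05-14_23-41-23
ts="${base#keycloak_*_}"      # 2024-05-14_23-41-23 (strip prefix and version)
date_part="${ts%_*}"          # 2024-05-14
time_part="${ts#*_}"          # 23-41-23
time_fmt=$(printf '%s' "$time_part" | tr '-' ':')

echo "backup taken at: $date_part $time_fmt"
```

Compare the extracted timestamp against the time you ran the v4.12 upgrade to pick the right file.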

Modify the Scale of Operator and Tooling Pods

Temporarily turn off the operator and spin up the tooling pod:

kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE operator --replicas=0
kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE tooling --replicas=1

Wait for the tooling pod to start and retrieve its name, which you will use later:

HDX_TOOLING_POD_NAME=$(kubectl get pods --namespace $HDX_KUBERNETES_NAMESPACE -l app=tooling | grep 'tooling' | awk '{print $1}')

Double-check your work with an echo $HDX_TOOLING_POD_NAME to make sure the pod name is in the shell variable.
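If you prefer the check to fail loudly rather than silently carrying an empty variable into later commands, a small guard works; the pod name below is a made-up example, not one from your cluster:

```shell
# Abort early if the tooling pod lookup came back empty.
HDX_TOOLING_POD_NAME="tooling-6d9c7b59-abcde"   # example value from the lookup
if [ -z "$HDX_TOOLING_POD_NAME" ]; then
  echo "ERROR: tooling pod not found; is the deployment scaled up?" >&2
  exit 1
fi
echo "Using tooling pod: $HDX_TOOLING_POD_NAME"
```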

Next, copy the encrypted database backup file to your new tooling pod:

kubectl cp <keycloak_backup_file> $HDX_KUBERNETES_NAMESPACE/$HDX_TOOLING_POD_NAME:/

Decrypt the Backup

Retrieve the database file encryption password from your Kubernetes cluster's secrets:

export HDX_ENC_PASS=$(kubectl --namespace $HDX_KUBERNETES_NAMESPACE get secret general -o jsonpath="{.data.DB_ENCRYPTION_PASSWORD}" | base64 --decode)
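Kubernetes stores secret values base64-encoded, which is why the command above pipes the output through base64 --decode. A quick local illustration with a stand-in value (not a real password):

```shell
# Secrets come back base64-encoded from the Kubernetes API; decode before use.
secret="example-db-password"                         # stand-in value
encoded=$(printf '%s' "$secret" | base64)            # as stored in the secret
decoded=$(printf '%s' "$encoded" | base64 --decode)  # as needed by openssl
echo "$decoded"
```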

Decrypt the file using the tooling pod, writing a new file called keycloak_backup.sql. Make sure you replace <keycloak_backup_file> below with the actual name of the encrypted backup file you downloaded in previous steps.

kubectl --namespace $HDX_KUBERNETES_NAMESPACE exec -it $HDX_TOOLING_POD_NAME -- openssl enc -aes-256-cbc -d -pbkdf2 -iter 100000 -salt -in <keycloak_backup_file> -out keycloak_backup.sql -pass pass:"$HDX_ENC_PASS"

Upon successful completion of this command, you should have a keycloak_backup.sql file in the root directory of the tooling pod's filesystem. To check this, list the file on the tooling pod:

kubectl exec --namespace $HDX_KUBERNETES_NAMESPACE --stdin --tty $HDX_TOOLING_POD_NAME -- ls -la keycloak_backup.sql
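If you want to sanity-check the cipher parameters before touching the real backup, the same openssl flags round-trip a throwaway file locally. The password and file paths here are stand-ins, assuming the backup format matches the decrypt command above (AES-256-CBC, PBKDF2, 100000 iterations):

```shell
# Round-trip a sample file with the same parameters used for the backup.
pass="stand-in-password"              # stand-in for $HDX_ENC_PASS
printf 'SELECT 1;\n' > /tmp/sample.sql

# Encrypt, then decrypt with identical parameters.
openssl enc -aes-256-cbc -pbkdf2 -iter 100000 -salt \
  -in /tmp/sample.sql -out /tmp/sample.sql.enc -pass pass:"$pass"
openssl enc -aes-256-cbc -d -pbkdf2 -iter 100000 -salt \
  -in /tmp/sample.sql.enc -out /tmp/sample.decrypted.sql -pass pass:"$pass"

# The decrypted file should be byte-identical to the original.
cmp -s /tmp/sample.sql /tmp/sample.decrypted.sql && echo "decrypt OK"
```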

Stop Keycloak

Scale the Keycloak pod to 0 replicas:

kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE keycloak --replicas=0

Drop and Recreate the Keycloak Database

Choose one of the options below, depending on whether your cluster uses PostgreSQL internally or externally through CrunchyData.

Internal PostgreSQL

Open a new shell to the tooling pod to save time while you're working there:

kubectl exec --namespace $HDX_KUBERNETES_NAMESPACE --stdin --tty $HDX_TOOLING_POD_NAME -- /bin/bash

Drop and recreate the Keycloak database:

PGPASSWORD="$ROOT_DB_PASSWORD" psql -h postgres -U turbine -c "DROP DATABASE IF EXISTS keycloak;"
PGPASSWORD="$ROOT_DB_PASSWORD" psql -h postgres -U turbine -c "CREATE DATABASE keycloak OWNER keycloak;"

External Crunchy PostgreSQL

Find the database password for external PostgreSQL instances using the kubectl command:

kubectl --namespace $HDX_KUBERNETES_NAMESPACE get secret main-pguser-postgres -o jsonpath="{.data.password}" | base64 --decode

Open a new shell to the tooling pod to save time while you're working there:

kubectl exec --namespace $HDX_KUBERNETES_NAMESPACE --stdin --tty $HDX_TOOLING_POD_NAME -- /bin/bash

Drop and recreate the Keycloak database, using the Crunchy IP address or the main-primary hostname and the port the PostgreSQL database listens on:

psql -h <hostname> -p <port> -U postgres -c "DROP DATABASE IF EXISTS keycloak;"
psql -h <hostname> -p <port> -U postgres -c "CREATE DATABASE keycloak OWNER keycloak;"

Restore the Backup

In the shell in the tooling pod, enter this command to restore the original data:

PGPASSWORD="$KEYCLOAK_DB_PASSWORD" pg_restore -h postgres -U keycloak -d keycloak keycloak_backup.sql

📘 Getting a Version Error?

If you see an error like pg_restore: error: unsupported version (1.15) in file header, the tooling pod's PostgreSQL client tools are older than the dump format. Update them with these steps:

apt update
apt install gnupg curl lsb-release
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg
apt update
apt install postgresql-16 postgresql-contrib-16

Then try the pg_restore command above again.

Once the backup is loaded into the database, exit the tooling pod with exit or ctrl-D.

Scale Back to Normal Operation

Scale up the operator and keycloak pods to 1 replica, and scale down the tooling pod:

kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE tooling --replicas=0
kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE keycloak --replicas=1
kubectl scale deployment --namespace $HDX_KUBERNETES_NAMESPACE operator --replicas=1

Within a few minutes, your cluster should be back up and running the old version of Hydrolix.