Deploy Production Postgres - LKE

By default, Hydrolix provisions a single internal PostgreSQL pod to store the catalog (metadata storage). In a production environment, we recommend a Kubernetes-hosted, high-availability PostgreSQL deployment with backups enabled, such as CloudNativePG v1.25.

❗️

Potential Unrecoverable Data Loss - Please read.

If you have been loading data and this is a migration, don't proceed unless you fully understand the migration process. Catalog loss can make data unrecoverable. Before migrating an existing deployment, we strongly suggest contacting Hydrolix support and reviewing the page Migrate to External PostgreSQL.

Deploy High-Availability PostgreSQL in Kubernetes

Hydrolix recommends using CloudNativePG (CNPG) when managing your own PostgreSQL cluster in Kubernetes. CNPG is external to Hydrolix. Review Installation and upgrades - CloudNativePG v1.25 for detailed instructions on installing CNPG.
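
As a minimal sketch, the CNPG operator can be installed by applying the release manifest for your target version. The manifest URL and patch version below are assumptions; take the exact values from the CNPG installation page.

kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.25/releases/cnpg-1.25.0.yaml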

After installing the CNPG operator, create a new PostgreSQL Cluster in the same namespace as your HydrolixCluster.

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: catalog
  namespace: {namespace}
spec:
  backup: {backup}
  bootstrap:
    initdb:
      database: catalog
      owner: query_api
  enableSuperuserAccess: true
  imageName: ghcr.io/cloudnative-pg/postgresql:15.12
  instances: 3
  primaryUpdateMethod: switchover
  storage:
    size: 100Gi

Hydrolix recommends defining a backup object store for archiving Write-Ahead Logging (WAL) files and backups. Refer to CNPG documentation Appendix A - Common object stores for backups - CloudNativePG v1.25 for details on supported options.
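
For example, a backup section that archives WAL files and base backups to an S3-compatible bucket might look like the following sketch; the destinationPath, the aws-creds secret, and its keys are placeholder assumptions to adapt to your object store:

spec:
  backup:
    barmanObjectStore:
      # Bucket and prefix for WAL archives and base backups (placeholder)
      destinationPath: "s3://my-cnpg-backups/catalog"
      s3Credentials:
        accessKeyId:
          name: aws-creds          # placeholder secret holding object store credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY
    retentionPolicy: "30d"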

Hydrolix requires the field enableSuperuserAccess to be enabled so it can log in as the root user and create the additional Keycloak and config API databases and users.

Apply the Cluster object and wait for the status to show as healthy.

kubectl apply -f catalog.yaml
kubectl -n "${namespace}" get cluster catalog
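
Once all three instances are ready, the output should look similar to the following (the age and elected primary will differ in your environment):

NAME      AGE   INSTANCES   READY   STATUS                     PRIMARY
catalog   5m    3           3       Cluster in healthy state   catalog-1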

Define the External PostgreSQL Connection

Disable the internal PostgreSQL instance by setting scale.postgres.replicas to 0.

Provide values for catalog_db_admin_user, catalog_db_admin_db, and catalog_db_host so your Hydrolix instance can connect to your newly created external PostgreSQL endpoint.

  1. Edit your HydrolixCluster resource.
kubectl edit hydrolixcluster hdx -n "${namespace}"
  2. Fill in the values for catalog_db_admin_user, catalog_db_admin_db, and catalog_db_host. Set scale.postgres.replicas to 0.
spec:
  catalog_db_admin_user: postgres  #<--- Add the admin user "postgres"
  catalog_db_admin_db: postgres    #<--- Add the admin db "postgres"
  catalog_db_host: catalog-rw      #<--- Add the read/write svc endpoint for your catalog cluster

  scale:
    postgres:
      replicas: 0                  #<---- Set the internal postgres to 0!
  scale_profile: prod
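
After saving, a quick sanity check is to read the new values back from the resource, for example:

kubectl -n "${namespace}" get hydrolixcluster hdx -o jsonpath='{.spec.catalog_db_host}{"\n"}'
kubectl -n "${namespace}" get hydrolixcluster hdx -o jsonpath='{.spec.scale.postgres.replicas}{"\n"}'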

Create Secret

Store the PostgreSQL passwords within the curated Kubernetes secret.

  1. Retrieve the passwords for the PostgreSQL superuser (root) and the query_api user that CNPG created.
ROOT_PWD=$(kubectl -n "${namespace}" get secret catalog-superuser -o jsonpath='{.data.password}' | base64 --decode)
CATALOG_PWD=$(kubectl -n "${namespace}" get secret catalog-app -o jsonpath='{.data.password}' | base64 --decode)
  2. Edit the curated secret.
kubectl -n "${namespace}" edit secret curated
  3. Add a stringData property with entries for both passwords. Kubernetes automatically base64-encodes the values from stringData and stores them under data. When you read the curated secret afterwards, only the data key is present; stringData exists solely to accept unencoded input.
stringData:
  ROOT_DB_PASSWORD: ${ROOT_PWD}
  CATALOG_DB_PASSWORD: ${CATALOG_PWD}
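
To confirm the passwords were stored, you can decode one of the keys back out of the curated secret; the output should match the value of ${CATALOG_PWD} captured earlier:

kubectl -n "${namespace}" get secret curated -o jsonpath='{.data.CATALOG_DB_PASSWORD}' | base64 --decode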

📘

Already Running Cluster

If your cluster is already running, run the following command to redeploy the cluster with these settings applied:

kubectl -n "${namespace}" rollout restart deployment