Deploy Production PostgreSQL

Hydrolix provisions a single internal PostgreSQL pod to store the catalog. This default configuration has no high availability or automated backups and isn't suitable for production.

For production deployments, choose an option:

  • CloudNativePG: Manages a high-availability PostgreSQL cluster inside the AKS cluster. Best for deployments that keep all components in Kubernetes and back up to object storage.
  • Azure Database for PostgreSQL: A fully managed PostgreSQL service in Azure. Best for deployments that prefer an externally managed database with Azure-native backups and maintenance.

New deployments don't require migration

For new Hydrolix installations, complete all steps on this page after deploying the Operator and HydrolixCluster resource, but before ingesting any data. The Operator creates the required databases, users, and permissions on the external or CNPG instance. No migration is needed.

Existing deployments require migration

Switching an existing deployment from the internal PostgreSQL pod to an external database requires a catalog migration. Catalog loss can lead to data becoming unrecoverable. Contact Hydrolix support and review Migrate to External PostgreSQL before proceeding.

Prerequisites

For Azure Database for PostgreSQL, also:

  • Azure CLI installed and authenticated with permission to create resources in the AKS resource group.

Deploy high-availability PostgreSQL in Kubernetes

Use CloudNativePG (CNPG) to manage a high-availability PostgreSQL cluster in Kubernetes. CNPG is external to Hydrolix.

  1. Install the CNPG operator. See Installation and upgrades - CloudNativePG for instructions.

  2. Create a catalog.yaml file with this minimal configuration:

    Catalog Cluster Minimal Configuration
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: catalog
      namespace: {namespace}
    spec:
      backup: {backup}      # -- Specify the object storage path here
      bootstrap:
        initdb:
          database: catalog
          owner: query_api
      enableSuperuserAccess: true
      imageName: ghcr.io/cloudnative-pg/postgresql:15.12
      instances: 3
      primaryUpdateMethod: switchover
      storage:
        size: 100Gi
    
    • backup: Specify an object storage path for archiving write-ahead log (WAL) files and backups. See Appendix C - Common object stores for backups - CloudNativePG for supported options.
    • owner: query_api: The PostgreSQL role that owns the catalog database. Hydrolix uses this role internally - don't change this value.
    • enableSuperuserAccess: Must be true so Hydrolix can log in with the root user to create the Keycloak and Config API databases and users.
  3. Apply the Cluster object and wait for the status to show as healthy.

    Apply and Verify Catalog Cluster
    kubectl apply -f catalog.yaml
    kubectl -n {namespace} get cluster catalog
    

    A healthy cluster shows Cluster in healthy state in the STATUS column with all instances ready:

    Expected Output
    NAME      AGE   INSTANCES   READY   STATUS                     PRIMARY
    catalog   5m    3           3       Cluster in healthy state   catalog-1
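
To script this check instead of polling by hand, something like the following may work. It is a minimal sketch that assumes the CNPG Cluster resource reports a Ready condition and that CNPG labels instance pods with cnpg.io/cluster. The function name wait_for_catalog and the 10-minute timeout are illustrative choices, not part of Hydrolix or CNPG.

```shell
# Hypothetical helper: block until the CNPG cluster "catalog" reports Ready,
# then list its instance pods and the read-write service.
wait_for_catalog() {
  local ns="$1"
  # The fully qualified resource name avoids clashes with other "cluster" CRDs.
  kubectl -n "$ns" wait --for=condition=Ready \
    clusters.postgresql.cnpg.io/catalog --timeout=10m &&
  kubectl -n "$ns" get pods -l cnpg.io/cluster=catalog &&
  kubectl -n "$ns" get service catalog-rw
}
```

Call it with the target namespace, for example wait_for_catalog {namespace}. The last command confirms that the catalog-rw service used later as catalog_db_host exists.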
    

Configure Azure Database for PostgreSQL

New deployments only

This section explains how to configure a new Hydrolix deployment to use an external PostgreSQL instance. To migrate an existing deployment, see Migrate to External PostgreSQL.

Configure an Azure Database for PostgreSQL flexible server instance in the same virtual network (VNet) as the AKS cluster. Deploy the instance without a public IP address, and in the same region as the AKS cluster for best performance. Size the instance based on the criteria in Scale profiles. For example, at Mega scale, provision an instance with 100 GB disk, six CPUs, and 24 GB of memory.

  1. Create the flexible server instance.

    Create Flexible Server Instance
    # --storage-size takes GiB for flexible servers; 128 covers the 100 GB target.
    az postgres flexible-server create \
      --resource-group "$HDX_AZURE_RG" \
      --name "$HDX_KUBERNETES_NAMESPACE" \
      --location "$HDX_AZURE_REGION" \
      --admin-user hdxpgadmin \
      --admin-password '<PASSWORD>' \
      --sku-name Standard_D4s_v3 \
      --storage-size 128 \
      --version 15
    
  2. Create the Hydrolix database.

    Create Hydrolix Database
    az postgres flexible-server db create \
      --resource-group "$HDX_AZURE_RG" \
      --server-name "$HDX_KUBERNETES_NAMESPACE" \
      --database-name hdx
    
  3. Note the hostname Azure provides for the server. Use this hostname as the catalog_db_host value when editing the HydrolixCluster resource.
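
The hostname can also be read from the CLI rather than the portal. The sketch below assumes the server was created with the resource group and name used in the commands above. The function name get_pg_host is hypothetical.

```shell
# Hypothetical helper: print the fully qualified domain name of a flexible server.
# $1 = resource group, $2 = server name
get_pg_host() {
  az postgres flexible-server show \
    --resource-group "$1" \
    --name "$2" \
    --query fullyQualifiedDomainName \
    --output tsv
}
```

For example, HDX_PG_HOST=$(get_pg_host "$HDX_AZURE_RG" "$HDX_KUBERNETES_NAMESPACE") stores the value to use for catalog_db_host.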

Define the external PostgreSQL connection

Disable the internal PostgreSQL instance and configure Hydrolix to connect to the external PostgreSQL endpoint.

  1. Edit the HydrolixCluster resource.

    Edit HydrolixCluster
    kubectl edit hydrolixcluster hdx -n {namespace}
    
  2. Fill in the values for catalog_db_admin_user, catalog_db_admin_db, and catalog_db_host. Set scale.postgres.replicas to 0.

    • For CloudNativePG, use catalog-rw as the catalog_db_host value. This is the CNPG read-write service endpoint that routes to the primary instance.
    • For an external managed PostgreSQL service, use the endpoint the cloud provider supplies.
    Values to Populate in hydrolixcluster.yaml
    spec:
      catalog_db_admin_user: postgres      #<--- Admin user: postgres for CNPG, or the admin user of the external instance
      catalog_db_admin_db: postgres        #<--- Database for initial admin connection
      catalog_db_host: <postgresql-host>   #<--- catalog-rw for CNPG, or the external endpoint
    
      scale:
        postgres:
          replicas: 0                      #<---- Disable the internal PostgreSQL pod
      scale_profile: prod
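
For automated setups, the same fields could be set without opening an editor. The following is a sketch only: it assumes the resource is named hdx as in the edit command above, and that a JSON merge patch is acceptable for the HydrolixCluster resource. The function name patch_catalog_connection is hypothetical.

```shell
# Hypothetical helper: set the external catalog connection fields and
# scale the internal PostgreSQL pod to zero with a JSON merge patch.
# $1 = namespace, $2 = PostgreSQL host, $3 = admin user (defaults to postgres)
patch_catalog_connection() {
  local ns="$1" host="$2" admin_user="${3:-postgres}"
  local patch
  patch=$(printf '{"spec":{"catalog_db_admin_user":"%s","catalog_db_admin_db":"postgres","catalog_db_host":"%s","scale":{"postgres":{"replicas":0}}}}' \
    "$admin_user" "$host")
  kubectl patch hydrolixcluster hdx -n "$ns" --type merge -p "$patch"
}
```

For CloudNativePG, patch_catalog_connection {namespace} catalog-rw matches the values shown above. For the Azure instance created earlier, pass the server hostname and hdxpgadmin as the third argument. A merge patch leaves other keys under scale untouched.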
    

Create the secret

Store the PostgreSQL credentials in a curated Kubernetes secret.

If using CloudNativePG, retrieve the auto-generated passwords first.

Retrieve CNPG Passwords
ROOT_PWD=$(kubectl -n {namespace} get secret catalog-superuser -o jsonpath='{.data.password}' | base64 -d)
CATALOG_PWD=$(kubectl -n {namespace} get secret catalog-app -o jsonpath='{.data.password}' | base64 -d)
  1. Edit the curated secret.

    Edit Curated Secret
    kubectl edit secret curated -n {namespace}
    
  2. Add the stringData property with the required credentials. Kubernetes encodes values from stringData and stores them in data. When reading the curated secret, only the data key is present.

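    For CloudNativePG, include both passwords. The editor doesn't expand shell variables, so replace ${ROOT_PWD} and ${CATALOG_PWD} with the values retrieved earlier: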

    CNPG Credentials for curated Secret
    stringData:
      ROOT_DB_PASSWORD: ${ROOT_PWD}
      CATALOG_DB_PASSWORD: ${CATALOG_PWD}
    

    For an externally managed PostgreSQL service, include only the admin password set when creating the instance:

    External PostgreSQL Credentials for curated Secret
    stringData:
      ROOT_DB_PASSWORD: <postgresql-admin-password>
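
As an alternative to editing the secret by hand, the credentials could be applied with a patch. This sketch writes base64-encoded values to the data key directly, which avoids JSON escaping problems with special characters in passwords. The helper names b64 and patch_curated_secret are hypothetical, and the sketch assumes the curated secret already exists.

```shell
# Hypothetical helper: base64-encode a string on a single line.
b64() { printf '%s' "$1" | base64 | tr -d '\n'; }

# Hypothetical helper: merge the database passwords into the curated secret.
# $1 = namespace, $2 = root/admin password, $3 = catalog password (CNPG only, optional)
patch_curated_secret() {
  local ns="$1" root_pwd="$2" catalog_pwd="${3:-}"
  local data
  data=$(printf '"ROOT_DB_PASSWORD":"%s"' "$(b64 "$root_pwd")")
  if [ -n "$catalog_pwd" ]; then
    data="$data,$(printf '"CATALOG_DB_PASSWORD":"%s"' "$(b64 "$catalog_pwd")")"
  fi
  kubectl patch secret curated -n "$ns" --type merge -p "{\"data\":{$data}}"
}
```

For CloudNativePG, patch_curated_secret {namespace} "$ROOT_PWD" "$CATALOG_PWD" uses the passwords retrieved earlier. For an externally managed service, pass only the admin password.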
    

New and existing deployments

The Operator picks up the secret on first deploy for new deployments.

If the Hydrolix cluster is already running, restart all deployments to apply the new credentials. Secret changes don't trigger automatic restarts.

Restart All Deployments
kubectl rollout restart deployment -n {namespace}
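
To confirm that the restart finished, each deployment's rollout status can be checked in turn. This is a minimal sketch. The function name wait_for_rollouts and the 5-minute timeout are illustrative assumptions.

```shell
# Hypothetical helper: wait for every deployment in a namespace to finish rolling out.
# $1 = namespace
wait_for_rollouts() {
  local ns="$1" d
  for d in $(kubectl -n "$ns" get deployments -o name); do
    # Stop at the first deployment that fails to become ready in time.
    kubectl -n "$ns" rollout status "$d" --timeout=5m || return 1
  done
}
```

Run wait_for_rollouts {namespace} after the restart. A non-zero exit status means at least one deployment didn't become ready within the timeout.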