
# Deploy Production PostgreSQL

By default, Hydrolix provisions a single internal PostgreSQL pod to store the catalog. For production workloads, use either an external PostgreSQL instance or a Kubernetes-hosted, high-availability PostgreSQL operator such as CloudNativePG v1.25. This page describes how to use either option with a Hydrolix deployment running on GKE.

> ❗️ Potential Unrecoverable Data Loss - Please Read
>
> If you have been loading data and this is a migration, don't proceed unless you fully understand the migration process. Loss of the catalog can make data unrecoverable. To migrate an existing deployment, it's strongly suggested to contact Hydrolix support and review [Migrate to External PostgreSQL](../../tutorials/migrating-postgres.md).

## Deploy High-Availability PostgreSQL in Kubernetes

Hydrolix recommends using CloudNativePG (CNPG) to manage your own PostgreSQL cluster in Kubernetes. CNPG is external to Hydrolix. Review Installation and upgrades - CloudNativePG v1.25 for detailed instructions on installing CNPG.

After installing the CNPG operator, create a new PostgreSQL `Cluster` in the same namespace as your HydrolixCluster:

```yaml
# catalog.yaml -- minimal configuration
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: catalog
  namespace: {namespace}
spec:
  backup: {backup} # -- Specify the object storage path here
  bootstrap:
    initdb:
      database: catalog
      owner: query_api
  enableSuperuserAccess: true
  imageName: ghcr.io/cloudnative-pg/postgresql:15.12
  instances: 3
  primaryUpdateMethod: switchover
  storage:
    size: 100Gi
```
Hydrolix recommends defining a `backup` object store for archiving Write-Ahead Log (WAL) files and backups. Refer to the CNPG documentation, [Appendix A - Common object stores for backups - CloudNativePG v1.25](https://cloudnative-pg.io/documentation/1.25/appendixes/object_stores/), for details on supported options.
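For example, on GKE the `backup` stanza can point at a Google Cloud Storage bucket via CNPG's Barman Cloud integration. This is a sketch only: the bucket name and retention policy below are hypothetical, and it assumes Workload Identity is configured for the cluster's service account.

```yaml
# Hypothetical backup configuration for a GCS bucket (replace the bucket name)
backup:
  barmanObjectStore:
    destinationPath: "gs://example-hydrolix-catalog-backups/"
    googleCredentials:
      gkeEnvironment: true   # authenticate via GKE Workload Identity
  retentionPolicy: "30d"     # keep 30 days of backups and WAL archives
```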

Hydrolix requires the field `enableSuperuserAccess` to be enabled so it can log in as the superuser and create the additional Keycloak and Config API databases and users.

Apply the `Cluster` object and wait for the status to show as healthy:

```shell
kubectl apply -f catalog.yaml
kubectl -n "${namespace}" get cluster catalog
```
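Rather than polling `get cluster` by hand, `kubectl wait` can block until the operator marks the resource ready. A sketch, assuming the `namespace` variable from above and CNPG's `Ready` condition on the `Cluster` resource:

```shell
# Block until the CNPG Cluster reports the Ready condition (up to 10 minutes)
kubectl -n "${namespace}" wait --for=condition=Ready cluster/catalog --timeout=600s
```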
## Deploy with External PostgreSQL

An external PostgreSQL instance runs outside of the Kubernetes cluster that hosts Hydrolix. The following describes how to configure an external PostgreSQL instance for a Hydrolix deployment running on GKE.

> ❗️ This Guide Only Applies to New Deployments
> 
> This guide explains how to _initially_ configure a Hydrolix deployment to use an external PostgreSQL instance. To migrate an existing Hydrolix cluster to an external PostgreSQL instance, see [Migrate to External PostgreSQL](../../tutorials/migrating-postgres.md).

## Create an External PostgreSQL Instance

Use the Google Cloud SQL service to create your external PostgreSQL instance. You can create your instance with high availability, backups, deletion protection and more.

Size your instances based on the criteria specified in [Scale Profiles](../../scale-cluster/manual-scaling/scale-profiles.md). For example, at Mega scale you should provision an instance with 100 GB of disk, 6 CPUs, and 24 GB of memory.

There is no need to provide your instance with a public IP. To connect with Hydrolix, deploy the PostgreSQL instance within the same Virtual Private Cloud (VPC). For the best performance, colocate the instances within the same region. For more information about connecting Kubernetes with Cloud SQL, see [Google's documentation](https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine#private-ip).
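As a sketch, such an instance can be created with the gcloud CLI. The instance name, region, and network below are hypothetical; the tier `db-custom-6-24576` corresponds to the 6 CPU / 24 GB Mega-scale sizing above, and `--no-assign-ip` keeps the instance on a private VPC address only.

```shell
# Hypothetical Cloud SQL instance sized for the Mega scale profile
# (db-custom-6-24576 = 6 vCPUs, 24576 MB RAM)
gcloud sql instances create hydrolix-catalog \
  --database-version=POSTGRES_15 \
  --tier=db-custom-6-24576 \
  --storage-size=100GB \
  --region=us-central1 \
  --network=projects/example-project/global/networks/example-vpc \
  --no-assign-ip \
  --availability-type=REGIONAL
```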

## Define the External PostgreSQL Connection

Disable the internal PostgreSQL instance by setting `scale.postgres.replicas` to `0`. 

Provide values for `catalog_db_admin_user`, `catalog_db_admin_db`, and `catalog_db_host` so your Hydrolix instance can connect to your newly created external PostgreSQL endpoint.

1. Edit your HydrolixCluster resource:

   ```shell
   kubectl edit hydrolixcluster hdx -n {namespace}
   ```
2. Fill in the values for `catalog_db_admin_user`, `catalog_db_admin_db`, and `catalog_db_host`. Set `scale.postgres.replicas` to `0`.

```yaml
# values to populate in hydrolixcluster.yaml
spec:
  catalog_db_admin_user: postgres  #<--- Add the admin user "postgres"
  catalog_db_admin_db: postgres    #<--- Add the admin db "postgres"
  catalog_db_host: catalog-rw      #<--- Add the read/write svc endpoint for your catalog cluster

  scale:
    postgres:
      replicas: 0                  #<---- Set the internal postgres to 0!
  scale_profile: prod
```

## Create Secret

Store the PostgreSQL passwords within the `curated` Kubernetes secret.

  1. Retrieve the passwords for the PostgreSQL superuser and the `query_api` user that were created:

     ```shell
     ROOT_PWD=$(kubectl -n {namespace} get secret catalog-superuser -o jsonpath='{.data.password}' | base64 --decode)
     CATALOG_PWD=$(kubectl -n {namespace} get secret catalog-app -o jsonpath='{.data.password}' | base64 --decode)
     ```

  2. Edit the curated secret:

     ```shell
     kubectl edit secret curated -n {namespace}
     ```

  3. Add a `stringData` property with entries for both passwords. Kubernetes automatically encodes the values from `stringData` and stores them under `data`. When reading the `curated` secret after storage, the `data` key will be present, but not `stringData`, which is used only to accept unencoded input.

     ```yaml
     # properties to add to the curated secret
     stringData:
       ROOT_DB_PASSWORD: ${ROOT_PWD}
       CATALOG_DB_PASSWORD: ${CATALOG_PWD}
     ```
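The `stringData` encoding described above can be reproduced locally to sanity-check a value before and after storage. A minimal sketch, using a hypothetical password value:

```shell
# Kubernetes stores stringData values base64-encoded under data;
# this mirrors that round trip locally (password value is hypothetical)
pwd_value='s3cret-pa55'
encoded=$(printf '%s' "$pwd_value" | base64)
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```

The decoded output matches the original value, which is the same check you can run against the stored `data` entries.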

> 📘 Already Running Cluster
>
> If you have already deployed to the cluster, use the following command to restart its deployments with your new configuration:
>
> ```shell
> kubectl -n {namespace} rollout restart deployment
> ```