Deploy Production PostgreSQL - EKS
By default, Hydrolix provisions a single internal PostgreSQL pod to store the catalog. For production-scale deployments, use either an external PostgreSQL instance or a Kubernetes-hosted, high-availability PostgreSQL such as CloudNativePG v1.25.
This page describes how to use either CloudNativePG or Amazon RDS with a Hydrolix deployment on EKS.
Potential Unrecoverable Data Loss - Please read.
If you have been loading data and this is a migration, don't proceed unless you fully understand the migration process. Losing the catalog can make your data unrecoverable. Before migrating an existing deployment, contact Hydrolix support and review the page Migrate to External PostgreSQL.
Deploy High-Availability PostgreSQL in Kubernetes
Hydrolix recommends using CloudNativePG (CNPG) when managing your own PostgreSQL cluster in Kubernetes. CNPG is external to Hydrolix. Review Installation and upgrades - CloudNativePG v1.25 for detailed instructions on installing CNPG.
Alternatively, you can set up an RDS instance in the same account as your EKS cluster.
After installing the CNPG operator, create a new PostgreSQL Cluster in the same namespace as your HydrolixCluster.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: catalog
  namespace: {namespace}
spec:
  backup: {backup} # -- Specify the object storage path here
  bootstrap:
    initdb:
      database: catalog
      owner: query_api
  enableSuperuserAccess: true
  imageName: ghcr.io/cloudnative-pg/postgresql:15.12
  instances: 3
  primaryUpdateMethod: switchover
  storage:
    size: 100Gi
Hydrolix recommends defining a backup object store for archiving Write-Ahead Log (WAL) files and backups. Refer to the CNPG documentation Appendix A - Common object stores for backups - CloudNativePG v1.25 for details on supported options.
Hydrolix requires enableSuperuserAccess to be enabled so that it can log in as the superuser and create the additional Keycloak and config API databases and users.
Apply the Cluster object and wait for the status to show as healthy.
kubectl apply -f catalog.yaml
kubectl -n "${namespace}" get cluster catalog
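Waiting for the healthy status can also be scripted. The following is a minimal sketch, not an official Hydrolix or CNPG procedure: it assumes the CNPG operator reports the number of ready instances in status.readyInstances of the Cluster object, and the function name wait_for_catalog and its polling defaults are hypothetical.

```shell
#!/bin/sh
# Poll the CNPG Cluster named "catalog" until every instance reports ready.
# Assumption: the operator populates status.readyInstances on the Cluster.
wait_for_catalog() {
  namespace="$1"       # namespace of the HydrolixCluster
  expected="$2"        # should match spec.instances (3 in the manifest above)
  attempts="${3:-60}"  # hypothetical default: give up after 60 polls
  delay="${4:-10}"     # hypothetical default: seconds between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    ready=$(kubectl -n "$namespace" get cluster catalog \
      -o jsonpath='{.status.readyInstances}' 2>/dev/null)
    if [ "${ready:-0}" = "$expected" ]; then
      echo "catalog is ready (${ready}/${expected} instances)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for catalog" >&2
  return 1
}

# Example (not executed here): wait_for_catalog my-namespace 3
```

The function returns a non-zero status on timeout, so it can gate later steps in a deployment script.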
Configure RDS PostgreSQL
The RDS PostgreSQL instance must be in the same AWS account, VPC, and region as your EKS cluster. You need the VPC ID to create the instance.
Run the following command to find the VPC ID of your cluster.
aws eks describe-cluster --name ${HDX_KUBERNETES_NAMESPACE} --query 'cluster.resourcesVpcConfig.vpcId'
- In the AWS console, switch to the RDS service and click Create database. Select PostgreSQL and engine version 11.12-R1 or greater.
- Select the "Production" template. Choose the "Multi-AZ DB instance" availability and durability option. Enter a database name.
- Supply a root username and password; Hydrolix uses these to access the database.
- In the storage section, select the General Purpose gp3 storage type with 100 GiB of allocated storage.
- In the connectivity section, select the VPC ID associated with your EKS cluster. The dropdown lists every VPC in the region as: VPC Tag name (VPC ID). Select the default DB subnet group. If you don't know your VPC ID, run the command shown at the beginning of this section.
- Select both the EKS cluster and node security groups from the Existing VPC security groups dropdown list. Use the default certificate authority and password authentication.
- Disable the following settings by unchecking their checkboxes: Turn on Performance Insights, and Enable auto minor version upgrade.
- Click Create Database to confirm your settings.
It takes about 10 minutes to create your database. When ready, AWS provides an endpoint to connect to your database. Find this endpoint in the Connectivity & security tab of the database details page. Use this endpoint as the catalog_db_host in the next step.
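The endpoint can also be read from the AWS CLI instead of the console. The following is a minimal sketch that assumes you know the DB instance identifier chosen during creation; the function name rds_endpoint and the identifier in the example are hypothetical.

```shell
#!/bin/sh
# Wait until the RDS instance is available, then print its endpoint address.
rds_endpoint() {
  db_id="$1"  # DB instance identifier chosen when creating the database
  # Blocks until the instance reaches the "available" state.
  aws rds wait db-instance-available \
    --db-instance-identifier "$db_id" || return 1
  aws rds describe-db-instances \
    --db-instance-identifier "$db_id" \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text
}

# Example (not executed here): rds_endpoint my-catalog-db
```

The printed hostname is the value to use for catalog_db_host.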
Define the External PostgreSQL Connection
Disable the internal PostgreSQL instance by setting scale.postgres.replicas to 0.
Provide values for catalog_db_admin_user, catalog_db_admin_db, and catalog_db_host so your Hydrolix instance can connect to your newly created external PostgreSQL endpoint.
- Edit your HydrolixCluster resource.
kubectl edit hydrolixcluster hdx -n {namespace}
- Fill in the values for catalog_db_admin_user, catalog_db_admin_db, and catalog_db_host. Set scale.postgres.replicas to 0.
spec:
  catalog_db_admin_user: postgres #<--- Add the admin user "postgres"
  catalog_db_admin_db: postgres #<--- Add the admin db "postgres"
  catalog_db_host: catalog-rw #<--- Add the read/write svc endpoint for your catalog cluster
  scale:
    postgres:
      replicas: 0 #<---- Set the internal postgres to 0!
  scale_profile: prod
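If you prefer a non-interactive change over kubectl edit, the same four settings can be applied as a JSON merge patch. This is a sketch under assumptions rather than a documented Hydrolix procedure: the function names catalog_patch and apply_catalog_patch are hypothetical, the resource is assumed to be named hdx as in the edit command above, and the patch touches only the fields named in this section.

```shell
#!/bin/sh
# Print a JSON merge patch containing the catalog connection settings.
catalog_patch() {
  host="$1"              # catalog-rw for CNPG, or the RDS endpoint
  user="${2:-postgres}"  # admin user, defaults to "postgres"
  db="${3:-postgres}"    # admin database, defaults to "postgres"
  printf '{"spec":{"catalog_db_admin_user":"%s","catalog_db_admin_db":"%s","catalog_db_host":"%s","scale":{"postgres":{"replicas":0}}}}\n' \
    "$user" "$db" "$host"
}

# Apply the patch to the HydrolixCluster named "hdx" in the given namespace.
apply_catalog_patch() {
  namespace="$1"
  shift
  kubectl -n "$namespace" patch hydrolixcluster hdx --type merge \
    -p "$(catalog_patch "$@")"
}

# Example (not executed here): apply_catalog_patch my-namespace catalog-rw
```

Printing the patch with catalog_patch before applying it makes the change easy to review.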
Create Secret
Store the PostgreSQL passwords in the Kubernetes secret named curated.
- Retrieve the passwords for the PostgreSQL superuser (root) and the query_api user that were created. The --decode flag is used because it is accepted by both the GNU and macOS versions of base64.
ROOT_PWD=$(kubectl -n {namespace} get secret catalog-superuser -o jsonpath='{.data.password}' | base64 --decode)
CATALOG_PWD=$(kubectl -n {namespace} get secret catalog-app -o jsonpath='{.data.password}' | base64 --decode)
- Edit the curated secret.
kubectl edit secret curated -n {namespace}
- Add the stringData property with entries for both passwords. Kubernetes automatically encodes the values from stringData and stores them under data. When you read the curated secret afterward, the data key is present but stringData is not, because stringData is used only to accept unencoded input. Replace ${ROOT_PWD} and ${CATALOG_PWD} with the actual password values; kubectl edit does not expand shell variables.
stringData:
  ROOT_DB_PASSWORD: ${ROOT_PWD}
  CATALOG_DB_PASSWORD: ${CATALOG_PWD}
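For the CNPG case, the passwords can also be copied without opening an editor. The sketch below is an assumption-based alternative, not a documented Hydrolix procedure: it copies the already base64-encoded values from the catalog-superuser and catalog-app secrets directly into the data field of curated, which avoids decoding and JSON escaping. The function names are hypothetical, and it does not apply to RDS, where those two secrets do not exist.

```shell
#!/bin/sh
# Read the base64-encoded password from a CNPG-generated secret.
encoded_password() {
  kubectl -n "$1" get secret "$2" -o jsonpath='{.data.password}'
}

# Print a JSON merge patch that stores both encoded values under "data".
# Base64 text contains no characters that need JSON escaping.
curated_patch() {
  printf '{"data":{"ROOT_DB_PASSWORD":"%s","CATALOG_DB_PASSWORD":"%s"}}\n' \
    "$1" "$2"
}

# Copy both passwords into the "curated" secret of the given namespace.
copy_catalog_passwords() {
  namespace="$1"
  root_b64=$(encoded_password "$namespace" catalog-superuser) || return 1
  app_b64=$(encoded_password "$namespace" catalog-app) || return 1
  kubectl -n "$namespace" patch secret curated --type merge \
    -p "$(curated_patch "$root_b64" "$app_b64")"
}

# Example (not executed here): copy_catalog_passwords my-namespace
```

Because the values go into data rather than stringData, they must remain base64-encoded exactly as read from the source secrets.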
Already Running Cluster
If your cluster is already running, run the following command to restart its deployments so that these settings take effect:
kubectl rollout restart deployment -n {namespace}