Deploy Production PostgreSQL

By default, Hydrolix provisions a single internal PostgreSQL pod to store the catalog. Where production scales are required with the best performance it is suggested to use either an external instance of PostgreSQL or a Kubernetes hosted version of Postgres that is high availability (Crunchydata). This page describes how both these options can used in running a Hydrolix deployment on GKE.

❗️
Potential Unrecoverable Data Loss - Please read.
If you have been loading data and this is a migration, do not proceed unless you fully understand the migration process. Catalog loss can lead to data becoming unrecoverable. To migrate an existing deployment it is strongly suggested to talk to Hydrolix support and review the following page Migrate to External PostgreSQL.

Deploy a Kubernetes HA Postgres

Hydrolix has built in support for the Postgres Kubernetes Operator - (Crunchydata. Crunchy data is supplied externally of Hydrolix and the installation instructions into kubernetes can be found within their documentation here. Using the default install we have found is a good place to start kustomize/install/default

Once Crunchy data is deployed into your Kubernetes cluster the hydrolixcluster.yaml can be edited to add the following into the spec

apiVersion: hydrolix.io/v1
kind: HydrolixCluster
metadata:
  name: hdx
  namespace: .......
spec:
  admin_email: .......
  use_crunchydata_postgres: true
  db_bucket_region: .......
  db_bucket_url: .......
  env: {}
  hydrolix_name: hdx
  hydrolix_url: .......
  ip_allowlist:
  - .........
  scale:
		postgres:
      replicas: 2

By adding the use_crunchydata_postgres: true this will enable Hydrolix to use Crunchy data PostGres.

It is suggested to also update the replica count for PostGres when this is enabled to at least two.

  scale:
		postgres:
      replicas: 2

To confirm you have your new Crunchydata Postgres deployment running you can look for the pods that should have started successfully, they will be named main-main

$ kubectl get pods | grep main-main
main-main-4qjd-0                      4/4     Running     0             60m
main-main-cgxw-0                      4/4     Running     0             60m

Deploy with External PostgreSQL

The following external Postgres instance exists outside of the Kubernetes cluster that runs Hydrolix. The following describes how to configure an external PostgreSQL instance for a Hydrolix deployment running on GKE.

❗️
This Guide Only Applies to New Deployments
This guide explains how to initially configure a Hydrolix deployment to use an external PostgreSQL instance. To migrate a an existing Hydrolix cluster to an external PostgreSQL instance, see Migrate to External PostgreSQL.

Create an External PostgreSQL Instance

Use the Google Cloud SQL service to create your external PostgreSQL instance. You can create your instance with high availability, backups, deletion protection and more.

Size your instances based on the criteria specified in Scale Profiles. For example, at Mega scale you should provision an instance with 100GB Disk, 6 CPUs, and 24GB of memory.

There is no need to provide your instance with a public IP. To connect with Hydrolix, deploy the PostgreSQL instance within the same Virtual Private Cloud (VPC). For the best performance, colocate the instances within the same region. For more information about connecting Kubernetes with Cloud SQL, see Google's documentation.

Create a Hydrolix Cluster Configuration

The hydrolix-cluster command generates the hydrolixcluster.yaml deployment file. We have provided a number of scale profiles for various cloud providers and deployment sizes. Specify a profile using the scale-profile flag. You can also edit the hydrolixcluster.yaml to tune each deployment to your resource requirements. The following command writes the configuration for a dev scale deployment to a file called hydrolixcluster.yaml:

hkt hydrolix-cluster --scale-profile dev --ip-allowlist `curl -s ifconfig.me`/32 > hydrolixcluster.yaml

Add the External PostgreSQL Instance to your Hydrolix Cluster Configuration

Open hydrolixcluster.yaml in a text editor. Edit the values for the following keys:

spec.catalog_db_admin_user
spec.catalog_db_admin_db
spec.catalog_db_host
spec.pg_ssl_mode
spec.scale.postgres.replicas

---
apiVersion: hydrolix.io/v1
kind: HydrolixCluster
metadata:
  name: <NameSpace>
spec:
  admin_email: <admin email>
  db_bucket_url: <bucket path>
  db_bucket_region: <region>
  hydrolix_url: <hostname to use>
  catalog_db_admin_user: postgres  #<--- Add the admin user "postgres" to your config
  catalog_db_admin_db: postgres    #<--- Add the admin db "postgres" to your config
  catalog_db_host: <YOU HOST/IP>   #<--- Add the IP for your cluster
  pg_ssl_mode: <disable/require>   #<--- Set whether you would like to use SSL
  ip_allowlist:
    - 111.222.333.444/32
  scale_profile: dev
  scale:
    postgres:
      replicas: 0                  #<--- Set the internal postgres to 0 to disable it

For example:

---
apiVersion: hydrolix.io/v1
kind: HydrolixCluster
metadata:
  name: myhdxdeployment
spec:
  admin_email: [email protected]
  db_bucket_url: gs://myhdxdeployment
  db_bucket_region: us-central1
  hydrolix_url: http://my.hdxdeploymente.com
  catalog_db_admin_user: postgres
  catalog_db_admin_db: postgres
  catalog_db_host: 11.22.11.22
  pg_ssl_mode: disable
  ip_allowlist:
    - 111.222.333.444/32
  scale_profile: dev
  scale:
    postgres:
      replicas: 0

🚧
Disable the Built-in PostgreSQL Instance
Don't forget to disable the built-in PostgreSQL instance.

Create your Secret

Store your PostgreSQL Secret in a curated secret within Kubernetes:

---
apiVersion: v1
kind: Secret
metadata:
  name: curated
  namespace: <namespace>
stringData:
  ROOT_DB_PASSWORD: <the password to your postgres>
type: Opaque

For example:

---
apiVersion: v1
kind: Secret
metadata:
  name: curated
  namespace: myhdxdeployment
stringData:
  ROOT_DB_PASSWORD: mysupersecretpassword
type: Opaque

Apply your configuration

Run the following commands to deploy this configuration to your Kubernetes cluster:

kubectl apply -f secrects.yaml
kubectl apply -f hydrolixcluster.yaml

📘
Already Running Cluster
If you have already deployed to the cluster, use the following command to reset the cluster with your new configuration:
kubectl rollout restart deployment

Deploy Production PostgreSQL - GKE

❗️
Potential Unrecoverable Data Loss - Please read.

Deploy a Kubernetes HA Postgres

Deploy with External PostgreSQL

❗️
This Guide Only Applies to New Deployments

Create an External PostgreSQL Instance

Create a Hydrolix Cluster Configuration

Add the External PostgreSQL Instance to your Hydrolix Cluster Configuration

🚧
Disable the Built-in PostgreSQL Instance

Create your Secret

Apply your configuration

📘
Already Running Cluster

❗️Potential Unrecoverable Data Loss - Please read.

Deploy a Kubernetes HA Postgres

Deploy with External PostgreSQL

❗️This Guide Only Applies to New Deployments

Create an External PostgreSQL Instance

Create a Hydrolix Cluster Configuration

Add the External PostgreSQL Instance to your Hydrolix Cluster Configuration

🚧Disable the Built-in PostgreSQL Instance

Create your Secret

Apply your configuration

📘Already Running Cluster

❗️
Potential Unrecoverable Data Loss - Please read.

❗️
This Guide Only Applies to New Deployments

🚧
Disable the Built-in PostgreSQL Instance

📘
Already Running Cluster