Deploy Production PostgreSQL
By default, Hydrolix provisions a single internal PostgreSQL pod to store the catalog. For production scale and the best performance, we suggest using either an external PostgreSQL instance or a highly available PostgreSQL deployment hosted in Kubernetes (Crunchy Data). This page describes how to use both options with a Hydrolix deployment running on AKS.
Potential Unrecoverable Data Loss - Please read.
If you have been loading data and this is a migration, do not proceed unless you fully understand the migration process. Losing the catalog can render your data unrecoverable. To migrate an existing deployment, we strongly suggest contacting Hydrolix support and reviewing Migrate to External PostgreSQL.
Deploy a Kubernetes HA Postgres⚓︎
Hydrolix has built-in support for the Crunchy Data Postgres Kubernetes Operator. Crunchy Data is supplied externally to Hydrolix, and instructions for installing it into Kubernetes can be found in their documentation. We have found the default install (kustomize/install/default) to be a good starting point.
Once Crunchy Data is deployed into your Kubernetes cluster, edit hydrolixcluster.yaml to add the settings below to the spec.
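A minimal sketch (your file will contain other keys as well; values here are illustrative):

```yaml
spec:
  # Use the Crunchy Data Postgres Operator for the catalog database
  use_crunchydata_postgres: true
  scale:
    postgres:
      # Run at least two replicas for high availability
      replicas: 2
```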
By adding use_crunchydata_postgres: true, Hydrolix will use Crunchy Data PostgreSQL for the catalog.
When this is enabled, we also suggest setting the PostgreSQL replica count to at least two.
To confirm that your new Crunchy Data PostgreSQL deployment is running, look for the pods that should have started successfully; they will be named main-main.
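For example, assuming your Hydrolix namespace is hydrolix (substitute your own):

```shell
# List pods from the Crunchy Data-managed PostgreSQL cluster;
# names starting with main-main indicate the catalog database is up
kubectl get pods --namespace hydrolix | grep main-main
```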
Deploy with External PostgreSQL⚓︎
An external PostgreSQL instance runs outside of the Kubernetes cluster that hosts Hydrolix. The following describes how to configure an external PostgreSQL instance for a Hydrolix deployment running on AKS.
This Guide Only Applies to New Deployments
This guide explains how to initially configure a Hydrolix deployment to use an external PostgreSQL instance. To migrate an existing Hydrolix cluster to an external PostgreSQL instance, see Migrate to External PostgreSQL.
Create an External PostgreSQL Instance⚓︎
Use Azure Database for PostgreSQL to create your external instance. You can create your instance with high availability, backups, deletion protection, and more.
Size your instances based on the criteria specified in Scale Profiles. For example, at Mega scale you should provision an instance with a 100 GB disk, 6 CPUs, and 24 GB of memory.
There is no need to provide your instance with a public IP. To connect with Hydrolix, deploy the PostgreSQL instance within the same cloud network. For the best performance, colocate the instances within the same region. For more information about connecting AKS with Azure Database for PostgreSQL, see Microsoft's documentation.
Create a Postgres Database⚓︎
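Create a database and an admin role for the Hydrolix catalog on the new instance. A minimal sketch using psql; the hydrolix database name, hydrolix_admin role, and connection placeholders are illustrative, and should match the values you configure in hydrolixcluster.yaml below:

```shell
# Connect as the server admin user, then create a role and a database
# for the Hydrolix catalog (all names here are illustrative)
psql "host=<your-instance-host> user=<admin-user> dbname=postgres" <<'SQL'
CREATE ROLE hydrolix_admin WITH LOGIN PASSWORD '<choose-a-password>';
CREATE DATABASE hydrolix OWNER hydrolix_admin;
SQL
```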
Create a Hydrolix Cluster Configuration⚓︎
The hydrolix-cluster command generates the hydrolixcluster.yaml deployment file. We have provided a number of scale profiles for various cloud providers and deployment sizes. Specify a profile using the scale-profile flag. You can also edit the hydrolixcluster.yaml to tune each deployment to your resource requirements. The following command writes the configuration for a dev scale deployment to a file called hydrolixcluster.yaml:
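A sketch of the invocation, assuming the command is provided by the hkt tool and that your namespace is hydrolix (both are assumptions; adjust for your environment):

```shell
# Generate a dev-scale cluster configuration and write it to a file
hkt hydrolix-cluster --namespace hydrolix --scale-profile dev > hydrolixcluster.yaml
```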
Add the External PostgreSQL Instance to your Hydrolix Cluster Configuration⚓︎
Open hydrolixcluster.yaml in a text editor. Edit the values for the following keys:
- spec.catalog_db_admin_user
- spec.catalog_db_admin_db
- spec.catalog_db_host
- spec.pg_ssl_mode
- spec.scale.postgres.replicas
For example:
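The values below are illustrative; use the host, user, and database you created above:

```yaml
spec:
  # Connection details for the external catalog database
  catalog_db_admin_user: hydrolix_admin
  catalog_db_admin_db: hydrolix
  catalog_db_host: my-postgres.postgres.database.azure.com
  pg_ssl_mode: require
  scale:
    postgres:
      # Disable the built-in PostgreSQL instance
      replicas: 0
```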
Disable the Built-in PostgreSQL Instance
Don't forget to disable the built-in PostgreSQL instance by setting spec.scale.postgres.replicas to 0, as shown in the example above.
Create your Secret⚓︎
Store your PostgreSQL password in a curated secret within Kubernetes. For example:
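A sketch using kubectl; the secret name curated matches the prose above, but the key name ROOT_DB_PASSWORD and the namespace are assumptions, so confirm the expected names for your Hydrolix version:

```shell
# Store the catalog admin password where the Hydrolix operator can read it
# (the key name ROOT_DB_PASSWORD is an assumption; verify before use)
kubectl create secret generic curated --namespace hydrolix \
  --from-literal=ROOT_DB_PASSWORD='<your-password>'
```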
Apply your configuration⚓︎
Run the following command to deploy this configuration to your Kubernetes cluster:
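A minimal sketch, assuming the Hydrolix operator is already installed:

```shell
# Apply the cluster configuration; the operator reconciles the deployment
kubectl apply -f hydrolixcluster.yaml
```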
Already Running Cluster
If you have already deployed to the cluster, use the following command to reset the cluster with your new configuration:
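A sketch of one approach, assuming that re-applying the manifest and restarting workloads is sufficient for your deployment (both steps are assumptions; adjust as needed):

```shell
# Re-apply the updated configuration; the operator reconciles the change
kubectl apply -f hydrolixcluster.yaml

# Restart deployments so pods pick up the new catalog connection settings
kubectl rollout restart deployment --namespace hydrolix
```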