Prepare a Cluster
This guide shows how to configure Amazon Elastic Kubernetes Service (EKS) to host a Hydrolix cluster. It uses Amazon's command-line tools to build and configure the cluster.
Terraform Examples
For examples of configuring an EKS environment with Terraform, see the hydrolix/terraform GitHub repository.
Prepare an EKS Cluster
Setup
Prepare Command Line Tools
Install and configure the following tools on your local machine:
- aws to interact with the Amazon services
- eksctl to build the EKS cluster
- kubectl to interact with the cluster
Create a Local Environment Variables File
This guide uses environment variables as a templating mechanism. We recommend putting them into a file so you can load them into scope within your terminal shell. Write the following environment variables into a file called env.sh, replacing <> with appropriate values for your deployment:
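As a sketch only, env.sh might look like the following. The variable names used here and in the later examples (CLUSTER_NAME, CLUSTER_REGION, HDX_KUBERNETES_NAMESPACE, AWS_ACCOUNT_ID) are illustrative placeholders rather than canonical Hydrolix names:

```bash
# env.sh -- illustrative variable names; replace <> with values for your deployment
export CLUSTER_NAME=<your EKS cluster name>
export CLUSTER_REGION=<AWS region, e.g. us-east-1>
export HDX_KUBERNETES_NAMESPACE=<Kubernetes namespace for Hydrolix; also used as the bucket name below>
export AWS_ACCOUNT_ID=<your 12-digit AWS account ID>
```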
Next, run the following command to bring the variables into scope:
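Assuming the file is saved as env.sh in your current directory:

```bash
source env.sh
```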
Configure the Cluster
Create an EKS Cluster
We use eksctl to build the EKS cluster. With eksctl, you can define the cluster scale and node groups in a YAML file. Because Hydrolix contains a few StatefulSet deployments, you must add the Amazon Elastic Block Store (Amazon EBS) Container Storage Interface (CSI) driver in the addons section. To create your configuration file, write the following to a file named eksctl.yaml:
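A minimal ClusterConfig along these lines is one possible starting point; the node group name, node counts, volume size, and IAM add-on policies shown here are illustrative and should be tuned to your workload:

```yaml
# eksctl.yaml -- illustrative sketch; ${CLUSTER_NAME} and ${CLUSTER_REGION} are filled in by the next step
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: ${CLUSTER_NAME}
  region: ${CLUSTER_REGION}

iam:
  withOIDC: true             # enables the IAM OIDC provider used later for service accounts

addons:
  - name: aws-ebs-csi-driver # required for the StatefulSet volumes

managedNodeGroups:
  - name: hdx-workers
    instanceType: c5n.4xlarge
    desiredCapacity: 3
    minSize: 3
    maxSize: 12
    volumeSize: 100
    iam:
      withAddonPolicies:
        autoScaler: true
        ebs: true
```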
Instance Type
The choice of instanceType depends on your needs. We strongly recommend the c5n instance family due to its network bandwidth guarantees. c5n.4xlarge works well for development clusters. For production use cases, use c5n.9xlarge at minimum.
Use the following command to replace the environment variables above with their values:
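One way to do this is with envsubst (part of GNU gettext); the output file name here is arbitrary:

```bash
envsubst < eksctl.yaml > eksctl-resolved.yaml
```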
Run the following command to create a cluster based on your configuration file:
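For example, assuming the substituted configuration from the previous step is named eksctl-resolved.yaml:

```bash
eksctl create cluster -f eksctl-resolved.yaml
```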
Sharing an existing VPC
The above command creates a new VPC. eksctl also provides extended syntax for reusing an existing VPC, detailed in the full schema definition. Review the Amazon EKS VPC requirements documentation when selecting a VPC to join.
This step can take several minutes to complete. eksctl prints detailed progress updates in the terminal as it works.
Create an S3 Bucket & IAM Policy
Hydrolix stores your data in cloud storage. For this guide, the bucket uses the same region as the cluster you created in the previous step and the same name as your namespace. Run the following command to create an S3 bucket for Hydrolix data storage:
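As a sketch, assuming the namespace variable doubles as the bucket name:

```bash
# Note: omit --create-bucket-configuration when the region is us-east-1
aws s3api create-bucket \
  --bucket "$HDX_KUBERNETES_NAMESPACE" \
  --region "$CLUSTER_REGION" \
  --create-bucket-configuration LocationConstraint="$CLUSTER_REGION"
```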
To enable Hydrolix access to this bucket, associate it with an IAM policy. Run the following command to define the permissions required for Hydrolix:
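The exact set of S3 actions Hydrolix requires isn't reproduced here; the following is an illustrative minimal policy document covering listing, reading, writing, and deleting objects in the bucket (the file name is arbitrary):

```bash
cat <<EOF > hdx-bucket-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${HDX_KUBERNETES_NAMESPACE}"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::${HDX_KUBERNETES_NAMESPACE}/*"]
    }
  ]
}
EOF
```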
Then, run the following command to create the IAM policy associated with the permissions you just defined:
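For example, with the policy name being an illustrative choice:

```bash
aws iam create-policy \
  --policy-name "${CLUSTER_NAME}-hdx-bucket-policy" \
  --policy-document file://hdx-bucket-policy.json
```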
Create an IAM Policy for Service Accounts
Hydrolix service accounts interact with the cluster using AssumeRoleWithWebIdentity. This is a session-token-based mechanism managed by an OpenID Connect (OIDC) provider.
When you create the cluster via eksctl, Amazon automatically enables an IAM OIDC provider. Run the following command to access the information needed to connect to this provider:
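One common way to capture the provider's hostname and path is to strip the scheme from the cluster's OIDC issuer URL:

```bash
OIDC_PROVIDER=$(aws eks describe-cluster \
  --name "$CLUSTER_NAME" \
  --region "$CLUSTER_REGION" \
  --query "cluster.identity.oidc.issuer" \
  --output text | sed -e 's|^https://||')
echo "$OIDC_PROVIDER"
```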
Add the OIDC_PROVIDER environment variable to your env.sh script so it's available whenever you administer your Hydrolix cluster.
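For example, appending it to the file created earlier:

```bash
echo "export OIDC_PROVIDER=${OIDC_PROVIDER}" >> env.sh
```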
Run the following command to define the OIDC managed service account policies:
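This takes the form of a trust (assume-role) policy document. The sketch below allows any service account in the Hydrolix namespace to assume the role; whether Hydrolix needs a broader or narrower subject condition is an assumption you should verify for your deployment (the file name is arbitrary):

```bash
cat <<EOF > hdx-trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${HDX_KUBERNETES_NAMESPACE}:*"
        }
      }
    }
  ]
}
EOF
```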
Run the following command to create a role using this policy:
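For example, with an illustrative role name:

```bash
aws iam create-role \
  --role-name "${CLUSTER_NAME}-hdx-role" \
  --assume-role-policy-document file://hdx-trust-policy.json
```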
Finally, attach the service account IAM policy to the service account IAM role.
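Assuming the policy and role names from the earlier sketches:

```bash
aws iam attach-role-policy \
  --role-name "${CLUSTER_NAME}-hdx-role" \
  --policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/${CLUSTER_NAME}-hdx-bucket-policy"
```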
Create Namespace & GP3 Performance Disks
It's best to deploy Hydrolix in its own namespace. Run the following command to create that namespace:
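For example:

```bash
kubectl create namespace "$HDX_KUBERNETES_NAMESPACE"
```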
Next, set the Hydrolix namespace as the default namespace in kubectl:
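One way to do this is to update the current kubectl context:

```bash
kubectl config set-context --current --namespace="$HDX_KUBERNETES_NAMESPACE"
```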
Hydrolix deploys StatefulSet infrastructure, which benefits greatly from high-performance EBS storage due to the volume of data Hydrolix processes. Let's define a high-performance GP3 storage class to provision those disks:
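A sketch of such a StorageClass, using the EBS CSI driver installed with the cluster; the class name and default-class annotation are illustrative choices:

```yaml
# gp3-storageclass.yaml -- illustrative sketch
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```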
Finally, run the following command to create the GP3 disks in your cluster:
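Assuming the definition above is saved as gp3-storageclass.yaml:

```bash
kubectl apply -f gp3-storageclass.yaml
```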
Generate Operator Config
The Hydrolix operator resources API generates all of the Kubernetes resource definitions required to deploy the operator, including service accounts and role permissions. Once deployed, the operator manages your Hydrolix cluster deployment. To upgrade your deployment to a new version, repeat this step.
Run the following command to generate the operator YAML file, named operator.yaml:
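The operator resources API is queried over HTTP; the endpoint and query parameters below are assumptions for illustration, so confirm the exact URL for your target Hydrolix version before running it:

```bash
# Illustrative only: verify the operator resources URL and parameters for your Hydrolix version
curl -s "https://www.hydrolix.io/operator/latest/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}" > operator.yaml
```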
Configure Cluster Autoscaling
Deploy the Metrics Server
Autoscaling requires a metrics server. Deploy one into your cluster directly from its published manifest URL:
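For example, using the manifest published by the upstream metrics-server project:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```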
Create Autoscaler Node Group Policy
You must grant the autoscaler the autoscaling permissions it needs through an IAM policy for the nodes in your cluster. Run the following command to define the permissions:
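A sketch of such a policy document, based on the permissions the Kubernetes Cluster Autoscaler commonly requires on AWS (the file name is arbitrary):

```bash
cat <<EOF > cluster-autoscaler-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:DescribeInstanceTypes"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```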
Run the following command to create a policy for the autoscaler using those permissions:
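For example, with an illustrative policy name:

```bash
aws iam create-policy \
  --policy-name "${CLUSTER_NAME}-cluster-autoscaler-policy" \
  --policy-document file://cluster-autoscaler-policy.json
```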
Create Autoscaler Service Account Permissions
Next, run the following command to define the role that the cluster autoscaler's service account will assume:
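The role definition is a trust policy scoped to the autoscaler's service account. This sketch assumes the autoscaler runs as the cluster-autoscaler service account in kube-system, which is the default in the autodiscovery manifest used below:

```bash
cat <<EOF > cluster-autoscaler-trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:kube-system:cluster-autoscaler"
        }
      }
    }
  ]
}
EOF
```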
Then, create the role using the role definition:
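For example, with an illustrative role name:

```bash
aws iam create-role \
  --role-name "${CLUSTER_NAME}-cluster-autoscaler-role" \
  --assume-role-policy-document file://cluster-autoscaler-trust-policy.json
```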
Finally, attach the autoscaler policy to the role:
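Assuming the names used in the sketches above:

```bash
aws iam attach-role-policy \
  --role-name "${CLUSTER_NAME}-cluster-autoscaler-role" \
  --policy-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:policy/${CLUSTER_NAME}-cluster-autoscaler-policy"
```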
Deploy Cluster Autoscaler Autodiscovery
Run the following command to download the cluster autoscaler autodiscovery configuration into a file named cluster-autoscaler-autodiscover.yaml:
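The autodiscovery example manifest is published in the kubernetes/autoscaler repository:

```bash
curl -o cluster-autoscaler-autodiscover.yaml \
  https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
```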
Replace the placeholder <YOUR CLUSTER NAME> with your actual namespace. If you've loaded your environment variables defined in env.sh, you can replace it with the following command:
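For instance, assuming CLUSTER_NAME holds that value (substitute your namespace variable if the two differ):

```bash
# GNU sed shown; on macOS use `sed -i ''`
sed -i "s/<YOUR CLUSTER NAME>/${CLUSTER_NAME}/g" cluster-autoscaler-autodiscover.yaml
```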
Otherwise, you can manually replace <YOUR CLUSTER NAME> in cluster-autoscaler-autodiscover.yaml with a text editor.
Then apply the autoscaler autodiscovery configuration changes to your cluster with kubectl:
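For example:

```bash
kubectl apply -f cluster-autoscaler-autodiscover.yaml
```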
As a final step, annotate the cluster autoscaler service account:
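The annotation points the service account at the IAM role created above; the role name here matches the earlier illustrative sketch:

```bash
kubectl annotate serviceaccount cluster-autoscaler \
  -n kube-system \
  eks.amazonaws.com/role-arn="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-cluster-autoscaler-role"
```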
Congratulations! You are now ready to deploy Hydrolix on Amazon EKS. Proceed to the next step to get Hydrolix running.