Scale your Cluster
Scale Hydrolix services with Kubernetes
Overview
Scale a component's resources or replica count in the scale section of your hydrolixcluster.yaml file.
Hydrolix supports both stateful and stateless components, and scaling these requires different considerations.
Use the pre-made Hydrolix scale profiles that cover a range of throughput levels. For example, scale_profile: prod provides components scaled for a typical 1-4 TB/day workload.
Override a component's scale profile by setting the scale field in hydrolixcluster.yaml.
Deprecation notice
The stream-head and stream-peer services are deprecated, and were replaced with intake-head.
Stateful components
These components have a data_storage scale key:
Service | Description |
---|---|
postgres | Core |
prometheus | Reporting and Control |
rabbitmq | RabbitMQ |
redpanda | Redpanda |
zookeeper | Core |
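For example, to grow the persistent volume for the postgres service, set its data_storage key in the scale section. This is an illustrative sketch; the size shown is an assumption, not a sizing recommendation, and persistent volume storage can only be increased:

```yaml
scale:
  postgres:
    data_storage: 100Gi   # illustrative size; PVCs can only grow, never shrink
```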
Stateless components
These components all have cpu, memory, and storage scale keys, along with replicas:
Service | Description | Service | Description |
---|---|---|---|
alter-peer | Alter Jobs | query-head | Query |
batch-head | Ingest | query-peer | Query |
batch-peer | Ingest | reaper | Data Lifecycle |
decay | Data Lifecycle | stream-head | Ingest |
intake-api | Ingest | stream-peer | Ingest |
kafka-peer | Ingest | turbine-api | Query |
keycloak | Core | traefik | Core |
merge-head | Merge | version | Core |
merge-peer | Merge | zookeeper | Core |
Configure scaling
Edit the hydrolixcluster.yaml file to add or override component scale profiles.
kubectl edit hydrolixcluster --namespace="$HDX_KUBERNETES_NAMESPACE"
Stateful Persistent Volume changes
Persistent volume storage can only be increased, not decreased.
Scale pod settings
Use these settings to set the resource values of a pod:
Value | Description | Example |
---|---|---|
cpu | Amount of CPU to allocate to the pod/container | cpu: 2 or cpu: 2.5 |
memory | Amount of RAM to allocate to the pod/container | memory: 500Gi |
storage | Amount of ephemeral storage to allocate; this storage is not persistent | storage: 10Gi |
data_storage | Size of the persistent volume claim (PVC), for pods that support it | data_storage: 1TB |
When specified, a value usually sets both the request and the limit for that resource (memory, CPU, storage, data_storage). Use the overcommit or limit_cpu tunables for more flexibility. See HTN tunables for more information.
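As an illustrative sketch, a stateless component such as query-peer could combine these keys in the scale section (the values shown are assumptions, not sizing guidance):

```yaml
scale:
  query-peer:
    cpu: 4          # sets both CPU request and limit
    memory: 8Gi     # sets both memory request and limit
    storage: 10Gi   # ephemeral storage, not persistent
    replicas: 3
```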
Configure single pods
Modify single pod settings in the hydrolixcluster.yaml file.
For example, this setting modifies the intake-head pods to have two CPUs and 10 GiB of RAM allocated:
scale:
  intake-head:
    cpu: 2
    memory: 10Gi
Configure multi-container pods
Some Hydrolix services run as pods with multiple containers. For example, the stream peer service contains both the intake-head and turbine containers. The turbine container is the indexer component that executes transforms and indexes content.
Settings applied to the default intake-head service don't apply to the turbine container.
Use the <component>-indexer name to specify the turbine component in your hydrolixcluster.yaml file. For example, for intake, use intake-indexer.
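For example, this sketch sizes the intake-head container and its turbine indexer container separately (the resource values are illustrative, not recommendations):

```yaml
scale:
  intake-head:
    cpu: 2
    memory: 10Gi
  intake-indexer:   # targets the turbine container in the same pod
    cpu: 2
    memory: 4Gi
```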
See Scale Profiles for a list of pods.
Set scale profiles
Specify a scale_profile: key in your hydrolixcluster.yaml file with a value of prod or mega.
- prod - A fully resilient production deployment: 1-4 TB/day
- mega - A fully resilient large-scale production deployment: 10-50 TB/day
In this example, the scale_profile is set to prod:
apiVersion: hydrolix.io/v1
kind: HydrolixCluster
metadata:
  name: hdxcli-xxxyyyy
  namespace: hdxcli-xxxyyyy
spec:
  admin_email: [email protected]
  kubernetes_namespace: hdxcli-xxxyyyy
  kubernetes_profile: gcp
  env:
    EMAIL_PASSWORD:
  hydrolix_url: https://host.hydrolix.net
  db_bucket_region: us-central1
  scale_profile: prod  # set the prod scale profile
Apply the changes to automatically scale the system:
kubectl apply -f hydrolixcluster.yaml
See Scale Profiles to learn more.
Override a scale profile
To override component scale settings, add more instances for components, or increase resources, use the scale: section of your hydrolixcluster.yaml file.
The prod scale profile provides the batch-peer component with 2 GiB of memory and one replica. This example scales batch-peer to five instances with more memory than the scale profile provides.
Edit the hydrolixcluster.yaml file to add this override:
.....
spec:
  .....
  scale_profile: prod
  scale:
    batch-peer:
      memory: 5G
      replicas: 5
  .....
Apply your changes:
kubectl apply -f hydrolixcluster.yaml && kubectl -n $HDX_KUBERNETES_NAMESPACE rollout restart deployment operator
Services with PVC Storage
Some of the services Hydrolix uses need to maintain state in order to provide high availability and redundancy. The postgres service uses PVC storage. Specify storage using the data_storage key.
PVC changes can have significant impact. Contact Hydrolix Support if you have any questions.
Scale to zero
To autoscale everything in a cluster off except for the operator pod, add this line to the top-level spec:
scale_off: true
Only the operator pod remains running, to allow scaling back up.
...
kubernetes_profile: gcp
hydrolix_url: https://host.hydrolix.net
env:
  EMAIL_PASSWORD:
db_bucket_region: us-central1
scale_off: true  # turn everything off
It takes a few minutes for all components to scale down.
To scale back up, remove scale_off: true and apply the changes to the hydrolixcluster.yaml file.
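As a purely local sanity check before applying, you can confirm whether the flag is present in your manifest. This sketch writes a minimal spec fragment to a temporary path; the path and fragment are illustrative:

```shell
# Illustrative only: write a minimal spec fragment to a temp file,
# then confirm the scale_off flag is set before running `kubectl apply`.
cat > /tmp/hydrolixcluster.yaml <<'EOF'
spec:
  scale_off: true
EOF

if grep -q 'scale_off: true' /tmp/hydrolixcluster.yaml; then
  echo "scale_off is set: cluster will scale down to the operator pod"
fi
```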
Custom autoscaling with Prometheus metrics
The hdx-scaler, or autoscaler, provides Kubernetes cluster scaling using Prometheus metrics with low resource overhead.
Use metrics from other services to scale up
The autoscaler uses external metrics to decide when to scale up a scaled-to-zero service. If no separate app metric is specified, the scaler sets the minimum replica count to 1 instead of 0.
Use precision to set the scale ratio
The precision configuration sets the number of digits to round to when calculating the average-to-target ratio. The default is 10. A higher precision number smooths the transitions when scaling up and down.
Use precision to control when the ratio rounds to zero, which drives the desired pod count to zero. For more frequent scaling to zero, set a lower precision value. Set a higher precision value to keep small ratios above zero and scale to zero less often.
- A ratio of 0.045 with precision ≤ 1 rounds to zero, scaling down to zero pods.
- A ratio of 0.045 with precision ≥ 2 rounds up and keeps one replica active.
In this example, merge_duty_cycle, part of merge-controller, is the metric that determines when merge-peer scales up, if precision has been set to a low or zero value.
merge-peer:
  cpu: 4
  memory: 4Gi
  hdxscalers:
    - metric: merge_duty_cycle
      port: 27182
      target_value: 0.5
      cool_down_seconds: 40
      app: merge-controller
      precision: 1
  replicas: 1-5
  scale_profile: I
  service: merge-peer
Scale to minimal
This feature was added in version 5.3.
Use scale to minimal to autoscale most components to zero while leaving the cluster available for API calls and the UI. This feature provides extra cost savings by reducing resources used for idle workloads, while still allowing API interactions and scaling up when needed. This setting is off by default.
Scale to minimal is most effective for less-frequently used components, as there is a delay before scaling back up.
Enable scale to minimal
To enable scale to minimal, edit the hydrolixcluster.yaml file and add scale_min: true to the spec: section.
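Mirroring the scale_off example earlier on this page, the change is a single line in the spec (other fields elided):

```yaml
spec:
  scale_min: true   # autoscale most components to zero while keeping the API and UI available
```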
Active services when scaled to minimal
These pods remain active when scale_min: true is set:
traefik
ui
turbine-api
turbine-api-worker
zookeeper
keycloak
prometheus
validator
version
All other pods scale to zero when idle.