Overview

This feature was introduced in Hydrolix version 5.11.

The Hydrolix operator can deploy an optional kube-scheduler and descheduler to optimize node utilization. The scheduler decides which node each new pod lands on by scoring candidate nodes with a configurable strategy: MostAllocated for bin-packing or LeastAllocated for spreading. The descheduler evaluates already-running pods on a recurring interval and evicts those on under- or over-utilized nodes so the scheduler can place them again.

How the scheduler and descheduler work together

When both components are enabled, they work as a coordinated loop. The flow depends on which strategies you choose.

For cost optimization, the scheduler, the descheduler, and the Kubernetes Cluster Autoscaler work together to reduce node count:

  1. The scheduler, configured with MostAllocated, assigns new pods to the busiest nodes that can accommodate them, keeping pods packed tightly.
  2. The descheduler periodically identifies nodes where utilization has dropped below the configured thresholds and evicts their pods.
  3. The scheduler repacks the evicted pods onto busier nodes.
  4. The Cluster Autoscaler detects the empty nodes and removes them, reducing cloud costs.
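Step 1 of the cost-optimization flow can be illustrated with a minimal sketch of MostAllocated-style scoring. It assumes a single resource (CPU requests in millicores) and a 0 to 100 score range; the real kube-scheduler plugin combines several weighted resources. The `Node` class, the node names, and the numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

MAX_NODE_SCORE = 100  # assumed 0..100 score range


@dataclass
class Node:
    name: str
    cpu_capacity_m: int   # allocatable CPU, millicores (hypothetical values)
    cpu_requested_m: int  # CPU already requested by pods on the node


def most_allocated_score(node: Node, pod_cpu_m: int) -> int:
    """Score rises as the node fills up; 0 means the pod does not fit."""
    requested = node.cpu_requested_m + pod_cpu_m
    if requested > node.cpu_capacity_m:
        return 0
    return requested * MAX_NODE_SCORE // node.cpu_capacity_m


def pick_node(nodes, pod_cpu_m):
    """Return the fitting node with the highest score, or None if none fit."""
    fitting = [
        n for n in nodes
        if n.cpu_requested_m + pod_cpu_m <= n.cpu_capacity_m
    ]
    return max(
        fitting,
        key=lambda n: most_allocated_score(n, pod_cpu_m),
        default=None,
    )


nodes = [
    Node("node-a", 4000, 3000),  # busy, still has room
    Node("node-b", 4000, 1000),  # mostly idle
    Node("node-c", 4000, 3800),  # too full for a 500m pod
]
chosen = pick_node(nodes, pod_cpu_m=500)
print(chosen.name)
```

In this sketch the pod lands on node-a: node-c is busier but can't accommodate the request, and node-b scores lower because it's mostly idle. A LeastAllocated-style score would invert the ranking by rewarding free capacity instead.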

For workload balancing, the scheduler and descheduler work together to distribute load evenly:

  1. The scheduler, configured with LeastAllocated, assigns new pods to the least-loaded nodes that can accommodate them, distributing pods evenly across the cluster.
  2. The descheduler periodically identifies nodes whose utilization has risen above the configured target thresholds and evicts their pods.
  3. The scheduler redistributes the evicted pods onto less-loaded nodes.
  4. Workload stays balanced across the cluster, reducing hot spots and resource contention.
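The descheduler step in both flows comes down to comparing node utilization against configured thresholds. The following is a simplified sketch of that comparison, not the descheduler's implementation: it assumes utilization is expressed as a percentage per resource, and the node names, usage figures, and threshold values are hypothetical. The real descheduler also accounts for factors such as pod counts and whether each pod is evictable.

```python
def below_all(usage, limits):
    """True when every listed resource is under its limit (percent)."""
    return all(usage[r] < limits[r] for r in limits)


def above_any(usage, limits):
    """True when at least one listed resource exceeds its limit (percent)."""
    return any(usage[r] > limits[r] for r in limits)


def eviction_sources(nodes, strategy, thresholds, target_thresholds=None):
    """Return the names of nodes whose pods are eviction candidates."""
    under = [n for n, u in nodes.items() if below_all(u, thresholds)]
    if strategy == "HighNodeUtilization":
        # Drain underutilized nodes so their pods repack onto busier nodes.
        return under
    if strategy == "LowNodeUtilization":
        # Evict from overutilized nodes, but only when underutilized
        # nodes exist to receive the pods.
        over = [n for n, u in nodes.items() if above_any(u, target_thresholds)]
        return over if under else []
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical utilization snapshot, in percent of allocatable resources.
nodes = {
    "node-a": {"cpu": 85, "memory": 70},
    "node-b": {"cpu": 15, "memory": 10},
    "node-c": {"cpu": 50, "memory": 45},
}
thresholds = {"cpu": 20, "memory": 20}
target_thresholds = {"cpu": 70, "memory": 70}

consolidate = eviction_sources(nodes, "HighNodeUtilization", thresholds)
rebalance = eviction_sources(
    nodes, "LowNodeUtilization", thresholds, target_thresholds
)
print(consolidate, rebalance)
```

With these sample values, the consolidation strategy selects the nearly idle node-b, while the balancing strategy selects the overloaded node-a.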

When to use this feature

This feature serves two distinct goals.

  • For cost-driven deployments on managed Kubernetes services with node autoscaling, enable the scheduler with MostAllocated and the descheduler with HighNodeUtilization to consolidate pods onto fewer nodes and let the cluster autoscaler reclaim the empty ones.
  • For deployments that prioritize pod isolation or even resource distribution, enable the scheduler with LeastAllocated and the descheduler with LowNodeUtilization to spread pods across nodes. The default Kubernetes scheduler also spreads pods at placement, but only the Hydrolix scheduler and descheduler rebalance them over time.

Enabling the descheduler doesn't disrupt live workloads. During eviction cycles under sustained load, ingest and query continue without dropped data, HTTP errors, or significant latency spikes. The PodDisruptionBudgets (PDBs) that the operator creates cap how many replicas of the same service the descheduler can disrupt at once. Services with a single replica receive no PDB; for those services, eviction is gated by the descheduler's default evictor (see Pods protected from eviction).

Database pods are managed by a separate operator and aren't affected by the Hydrolix scheduler or descheduler.

Prerequisites

  • CLI deployments must set kubernetes_version in the spec (for example, kubernetes_version: v1.31.0) when scheduler.enabled: true. The operator uses this value to pin the matching kube-scheduler image. For operator-managed deployments, the operator autodiscovers the Kubernetes version, so you don't need to set it. The setting also isn't required when using an external scheduler.
  • The Kubernetes Cluster Autoscaler isn't required but is recommended for cost-optimization deployments. Without it, the scheduler consolidates pods but the resulting empty nodes aren't removed, so cloud cost savings don't appear at the node-pool level.

Interactions with other Hydrolix features

Pod anti-affinity rules

The Hydrolix scheduler respects pod anti-affinity rules. The Kubernetes scheduling pipeline runs in two phases: filtering and scoring. Strict anti-affinity constraints filter the set of candidate nodes first, and the scheduler's scoring strategy then scores only the remaining eligible nodes. See the Kubernetes scheduler documentation for more detail.
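The two-phase pipeline can be sketched as follows. This is an illustrative model, not the scheduler's code: nodes and pods are plain dictionaries, the `app=example` label is hypothetical, and the scoring function simply prefers the most utilized node, as a MostAllocated-style strategy would.

```python
def schedule(pod, nodes, score):
    """Filter nodes by required anti-affinity, then pick the best score."""
    # Phase 1: filtering. Drop any node that already runs a pod carrying
    # a label the incoming pod must avoid.
    eligible = [n for n in nodes if not (pod["avoid_labels"] & n["pod_labels"])]
    if not eligible:
        return None  # no eligible node; the pod would stay pending
    # Phase 2: scoring. Only the surviving nodes are compared.
    return max(eligible, key=score)["name"]


# Hypothetical cluster state; utilization is a percentage.
nodes = [
    {"name": "node-a", "utilization": 90, "pod_labels": {"app=example"}},
    {"name": "node-b", "utilization": 60, "pod_labels": {"app=other"}},
    {"name": "node-c", "utilization": 20, "pod_labels": set()},
]
pod = {"avoid_labels": {"app=example"}}

placement = schedule(pod, nodes, score=lambda n: n["utilization"])
print(placement)
```

Here node-a is the busiest node, but the anti-affinity filter removes it before scoring, so the pod lands on node-b, the busiest of the remaining candidates.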

Pod priority classes

The descheduler evicts pods regardless of pod priority class. Hydrolix doesn't configure a priorityThreshold in the descheduler's evictor, so pod priority doesn't prevent eviction. Priority class assignments still influence preemption and scheduling order.

Hydrolix pod autoscaler (hdx-scaler)

The descheduler respects each service's PDB maxUnavailable budget when evicting pods. Non-Ready replicas count against that budget, regardless of cause. While hdx-scaler is scaling a service, the descheduler can't evict from it until enough replicas return to Ready. Raising max_unavailable above 1 loosens this guard.
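The budget arithmetic behind this behavior can be sketched in a few lines. The function below is a simplified model that assumes an integer maxUnavailable (not a percentage); the function name and the replica counts are hypothetical.

```python
def evictions_allowed(replicas: int, ready: int, max_unavailable: int) -> int:
    """Number of pods that can be evicted right now under the budget."""
    # The budget requires at least this many Ready replicas at all times.
    desired_healthy = replicas - max_unavailable
    # Non-Ready replicas reduce the headroom, whatever made them non-Ready.
    return max(0, ready - desired_healthy)


# Steady state: 4 replicas, all Ready.
steady = evictions_allowed(replicas=4, ready=4, max_unavailable=1)

# Mid scale-up: 6 replicas expected, 2 new pods not Ready yet.
scaling = evictions_allowed(replicas=6, ready=4, max_unavailable=1)

# Same scale-up with a looser budget.
loosened = evictions_allowed(replicas=6, ready=4, max_unavailable=3)

print(steady, scaling, loosened)
```

In this model the steady-state service permits one eviction, the scaling service permits none until the new replicas become Ready, and a larger budget restores some headroom during the scale-up.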

Node targeting

The Hydrolix scheduler works alongside node targeting. When a service sets node_selector, node_affinity, or node_name, those constraints filter its candidate nodes before the scheduler's scoring strategy runs, in the same filtering-then-scoring pipeline described under Pod anti-affinity rules.

Where to go next

  • To enable and configure the bin-packing scheduler, see Scheduler.
  • To enable and configure the descheduler, see Descheduler.