Multi-Cluster Deployments⚓︎

Multi-cluster deployment distributes Hydrolix components across multiple Kubernetes clusters on Linode infrastructure. This architecture extends Linode's Kubernetes capabilities and enables large-scale data ingestion beyond single-cluster configurations.

What's multi-cluster deployment?⚓︎

In a standard Hydrolix deployment, all components run in a single Kubernetes cluster. For deployments exceeding 250 nodes on Linode, a multi-cluster architecture provides greater scalability: multiple Kubernetes clusters work together as one logical Hydrolix cluster, sharing a catalog database and object storage while maintaining workload isolation for intake, query, and merge operations. Each cluster in the deployment is specialized for specific workloads.

How it works⚓︎

Multi-cluster deployments use specialized clusters, shared resources, and cross-cluster communication to operate as a unified system on Linode LKE.

Architecture components⚓︎

A multi-cluster deployment consists of specialized Linode LKE clusters working together:

  • Platform cluster: Hosts the catalog database and manages cluster-wide configuration and metadata. Every deployment has exactly one platform cluster, which typically also runs the deployment's single set of data lifecycle services.
  • Intake clusters: Perform data ingestion. Typically one Linode cluster for each major data stream. This minimizes partition fragmentation and allows independent scaling.
  • Query clusters: Handle query execution with query heads and query peers. Usually one Linode cluster with multiple query pools inside.
  • Merge clusters: Perform data compaction and optimization. Typically one Linode cluster, and no more than six. The six-cluster maximum accommodates the three merge eras (hot, warm, cold) for both raw tables and summary tables, allowing dedicated resources per era when needed for high-throughput environments. Often colocated with the platform cluster.
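The structural rules above can be sketched as a small validation routine. This is an illustrative model only: the cluster and stream names are hypothetical, not Hydrolix configuration keys.

```python
# Hypothetical topology for a multi-cluster Hydrolix deployment on Linode LKE.
# All names here are illustrative placeholders, not real configuration values.
topology = {
    "platform": ["platform-0"],                 # always exactly one
    "intake":   ["intake-web", "intake-edge"],  # typically one per major data stream
    "query":    ["query-0"],                    # usually one cluster, multiple pools inside
    "merge":    ["merge-0"],                    # typically one, never more than six
}

def validate(topology: dict) -> None:
    """Check the structural rules described in the list above."""
    assert len(topology["platform"]) == 1, "exactly one platform cluster per deployment"
    # Six-cluster cap: three merge eras (hot, warm, cold) x raw and summary tables.
    assert 1 <= len(topology["merge"]) <= 6, "merge clusters capped at six"
    assert topology["intake"], "at least one intake cluster"
    assert topology["query"], "at least one query cluster"

validate(topology)  # raises AssertionError if a rule is violated
```

Modeling the topology as data makes the one-platform-cluster and six-merge-cluster constraints easy to enforce before provisioning anything.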

Shared resources⚓︎

All Linode clusters in the deployment share three resources:

  • Catalog database: Central metadata and configuration store that all clusters connect to
  • Object storage: S3-compatible storage with a primary bucket for configuration and N data buckets for partitions
  • Keycloak authentication: Unified authentication system across all clusters. While the database is shared, each cluster maintains a unique active cache for Keycloak, which can cause temporary desynchronization when configuration changes occur.
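One way to picture the sharing model is a single immutable settings object handed to every cluster. The URLs and field names below are assumptions for illustration, not actual Hydrolix settings.

```python
from dataclasses import dataclass

# Sketch: every cluster in the deployment points at the same three shared endpoints.
# All URLs are placeholders, not real Hydrolix configuration.
@dataclass(frozen=True)
class SharedResources:
    catalog_url: str      # central metadata and configuration store
    storage_bucket: str   # primary S3-compatible bucket (N data buckets omitted here)
    keycloak_url: str     # unified auth; each cluster still keeps its own local cache

shared = SharedResources(
    catalog_url="postgres://catalog.platform.example:5432/hydrolix",
    storage_bucket="s3://hdx-primary-bucket",
    keycloak_url="https://keycloak.platform.example/auth",
)

# Every cluster receives the same object, so there is one source of truth.
clusters = {name: shared for name in ("platform-0", "intake-web", "query-0", "merge-0")}
```

Because only the per-cluster Keycloak cache is local, a configuration change can briefly desynchronize clusters until their caches refresh, as noted above.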

Communication between clusters⚓︎

Clusters communicate through three primary mechanisms on Linode infrastructure:

  • Catalog access: All clusters connect to the platform cluster's catalog database for configuration and metadata
  • Object storage: All clusters read from and write to shared S3-compatible storage for data access
  • API endpoints: Services access each other through HTTPS URLs for cross-cluster communication
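A minimal sketch of the three paths, assuming placeholder hostnames and a hypothetical service path (nothing below is a documented Hydrolix endpoint):

```python
# Illustrative endpoint map for the three cross-cluster mechanisms.
ENDPOINTS = {
    "catalog": "postgres://catalog.platform.example:5432/hydrolix",  # catalog access
    "storage": "https://us-east-1.linodeobjects.com/hdx-data",       # shared object storage
}

def service_url(cluster_host: str, service_path: str) -> str:
    """Cross-cluster API calls go over HTTPS to the target cluster's service."""
    return f"https://{cluster_host}/{service_path}"

# e.g. a service in the query cluster reaching an API in the platform cluster
# ("hdx.platform.example" and "config/v1/orgs" are hypothetical):
url = service_url("hdx.platform.example", "config/v1/orgs")
```

The key point is that no cluster talks to another over a private mesh: everything flows through the shared catalog, shared object storage, or plain HTTPS.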

Real-world performance⚓︎

A major sports event in 2025 demonstrated multi-cluster capabilities at scale. The deployment ingested 200 TB of log data in 3.5 hours, achieving a peak ingest rate of 17.4 GB/second (equivalent to a 1.4 PB/day pace) while processing 55 billion high-cardinality records.

The multi-cluster architecture delivered a 5-10 second time to glass from event to analysis availability. The query system handled 55,000 queries with a median response time of 0.481 seconds. Summary tables reduced rows by 98.4%, improving query efficiency. The deployment scaled to 2,400 intake head pods and 85-200 query peers across the clusters, and the fully stateless compute infrastructure scaled down after the event completed.
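A quick back-of-the-envelope check of the sustained figures, assuming decimal units (1 TB = 10^12 bytes). Note the quoted ~1.4 PB/day pace corresponds to the 3.5-hour sustained average; the 17.4 GB/s peak would extrapolate to roughly 1.5 PB/day.

```python
# Sanity-check the quoted ingest figures (decimal units throughout).
GB, TB, PB = 1e9, 1e12, 1e15

total_bytes = 200 * TB         # total ingested during the event
duration_s = 3.5 * 3600        # 3.5 hours in seconds

avg_rate = total_bytes / duration_s   # sustained average, ~15.9 GB/s
daily_pace = avg_rate * 86400         # ~1.37 PB/day, i.e. the ~1.4 PB/day pace
```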

Next steps⚓︎

To understand when multi-cluster deployment is appropriate for Linode infrastructure, see Requirements and Limitations.

For deployment configuration and setup on Linode, contact Hydrolix support.