## Overview
Hydrolix includes an automated compaction and optimization Merge service as part of its data lifecycle. This service is enabled by default for all tables.
The Merge service combines small partitions into larger ones, improving compression efficiency and decreasing partition count. This results in better-performing queries and a smaller storage footprint for the same data.
This diagram shows data partitions in storage. During the merge process, three smaller partitions are combined into one larger partition with a longer time interval.
For more information about how the Merge service fits into the Hydrolix architecture, see the Merge platform overview.
## Merge controller
`merge-controller` was introduced in Hydrolix v5.3.0 and became the default merge service in v5.10, replacing `merge-head`. It improves operability, observability, and performance compared to `merge-head`.
```mermaid
%%{init: {
  'theme': 'base',
  'themeVariables': {
    'background': 'transparent',
    'fontSize': '18px',
    'edgeLabelBackground': 'transparent',
    'primaryColor': 'transparent',
    'primaryBorderColor': '#003D66',
    'primaryTextColor': '#424D57',
    'lineColor': '#003D66'
  },
  'flowchart': { 'padding': 20, 'nodeSpacing': 60, 'rankSpacing': 60, 'curve': 'basis' }
}}%%
flowchart LR
  CAT[(Catalog)]
  MC[merge-controller]
  P[merge-peer pools]
  S[(Storage)]
  CAT -->|partition stream| MC
  MC -->|candidates over gRPC| P
  P -->|R/W| S
  P -->|completion report| MC
  style CAT fill:#F4F6F8,stroke:#003D66,stroke-width:2px,color:#003D66
  style MC fill:none,stroke:#003D66,stroke-width:2px,color:#424D57
  style P fill:none,stroke:#00A99D,stroke-width:2px,color:#035F60
  style S fill:#F4F6F8,stroke:#003D66,stroke-width:2px,color:#003D66
```
**`merge-head` is deprecated**

`merge-head` is deprecated as of v5.10 and will be removed in v5.12. If your configuration explicitly uses `merge-head`, migrate to `merge-controller` before upgrading to v5.12.
Apply these changes to your `hydrolixcluster.yaml` in a single update. When you apply them:

- The `merge-head` pod shuts down
- A `merge-controller` pod starts up
- All `merge-peer` pods restart in all pools
Migration has minimal impact on a live cluster. Merge operations are temporarily delayed while the merge peers restart.
## Singleton enforcement and resilience
`merge-controller` runs as a singleton. On startup, it checks these conditions and refuses to bootstrap if either is true:

- One or more `merge-head` pods are currently running
- One or more `merge-controller` pods are currently running
Running more than one merge service at once won't damage the cluster or your data, but it's inefficient and may cause pods to enter a `CrashLoopBackOff` state, breaking merge functionality. `merge-controller` ensures only one merge service runs at a time.
merge-controller is resilient to normal process terminations, such as manual restarts and scale changes, and to abnormal terminations, such as pod evictions and OOM kills. merge-peer pods continue any merge operations in progress when merge-controller terminates, but don't receive new work until merge-controller returns. merge-peer pods repeatedly try to reconnect to merge-controller. After reconnecting, merge peers report their current status so merge-controller can reconstruct its state.
## Memory and performance
While `merge-controller` uses more memory than `merge-head` to maintain a continuously accurate view of the entire merge subsystem, the cluster's overall CPU and storage access demands decrease with `merge-controller`.
Merging partitions is a bin-packing exercise. The goal is to combine sets of smaller partitions with varying sizes into the fewest number of larger partitions without exceeding fixed size limits. The bin-packing algorithm in merge-controller, using an in-memory approach and a first-fit or best-fit strategy, significantly improves efficiency over merge-head. Removing the intermediate queue and using direct gRPC connections between merge peers and merge-controller also improve performance.
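As a simplified illustration of the first-fit idea (not Hydrolix's actual implementation — the sizes and the 4 GB limit here are arbitrary), each partition is placed into the first candidate merge that still has room, and a new candidate is opened only when none fits:

```python
def first_fit(partition_sizes, max_bin_size):
    """Group partition sizes into candidate merges using first-fit
    decreasing: place each partition into the first candidate with
    room; otherwise open a new candidate."""
    bins = []  # each bin is one candidate: a list of partition sizes
    for size in sorted(partition_sizes, reverse=True):
        for b in bins:
            if sum(b) + size <= max_bin_size:
                b.append(size)
                break
        else:
            bins.append([size])  # no existing candidate fits; open a new one
    return bins

# Pack six small partitions into as few 4 GB merge candidates as possible
sizes_gb = [2.5, 1.5, 1.2, 0.8, 0.5, 0.3]
candidates = first_fit(sizes_gb, max_bin_size=4.0)  # → 2 candidates
```

Keeping the whole working set in memory lets the controller run this packing continuously instead of round-tripping through an intermediate queue.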
In general, `merge-controller` uses more memory when:

- There are a large number of active merges
- There are many tables with high ingest volume. Higher ingest creates more partitions, which increases merge counts.
The impact of these variables is typically small but may be detectable.
## Performance tunables
The merge controller determines which existing partitions to combine into new, better-organized partitions. These lists of partitions are called candidates. Use these tunables to control resources for building candidates.
`merge_max_candidates` limits the number of candidates awaiting dispatch to merge peers. If set too low relative to the number of merge peers in a pool, throughput may drop. Higher values can improve performance but increase memory usage. Hydrolix recommends setting this to 500.

`merge_max_partitions_per_candidate` limits the number of partitions merged together in a single operation. Higher values allow more partitions per candidate, but may impact turbine's merge capacity.
Increasing these tunables can improve merge throughput at the cost of higher memory usage.
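These tunables live in the cluster configuration. As a hedged illustration (the key placement within `hydrolixcluster.yaml` and the `merge_max_partitions_per_candidate` value shown are assumptions — verify against your Hydrolix configuration reference), an override might look like:

```yaml
# Hypothetical sketch: raising merge tunables in hydrolixcluster.yaml.
# Verify key placement against your cluster's configuration reference.
spec:
  merge_max_candidates: 500                # recommended value; more candidates buffered for dispatch
  merge_max_partitions_per_candidate: 100  # illustrative value; more partitions per merge operation
```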
Since higher values increase memory usage, monitor pod restarts for OOM kills and watch for usage approaching pod limits.
## Metric candidates and autoscaling
The merge controller exposes a Prometheus gauge, `candidates`, that counts the number of partition groups waiting to be dispatched to merge peers. The metric is labeled per project, table, pool, and target. The `target` label maps to the merge peer era, such as `I`, `II`, or `III`.
### Construction limit
The merge controller limits memory usage by pausing construction of new candidates once its buffer reaches merge_max_candidates. For this reason, the candidates metric plateaus at the limit of merge_max_candidates. If the HDX Autoscaler scales merge peers based on this metric, it can't see demand beyond the plateau and won't provision additional pods. To set an alert that checks the candidates metric against merge_max_candidates, review Configure Alerts.
Hydrolix recommends setting `merge_max_candidates` to 500. A higher candidate count increases merge controller memory usage, so monitor for OOM kills after increasing this value.
If the candidates metric is persistently at merge_max_candidates, check the current replica count for the affected pool and apply the appropriate fix:
- **Replicas are below the configured maximum.** Reduce `target_value` in the pool's `hdxscalers` configuration. A lower target value causes the autoscaler to request more replicas for the same metric value. For example, lowering `target_value` from 50 to 25 doubles the targeted replica count.
- **Replicas are already at the configured maximum.** Increase `max` in the pool's `hdxscalers` configuration to give the autoscaler room to add pods.
Once the metric drops below the plateau, return `target_value` and replica limits to their normal range.
### Tune the autoscaler aggregation
The HDX Autoscaler can scale merge-peer replicas based on the `candidates` metric. Configure this in the `hdxscalers` section of a service or pool. Set the `op` parameter to control how multiple samples across tables and targets are aggregated. Supported values are `sum`, `avg`, `min`, and `max`. If `op` isn't set, the autoscaler uses only the first sample it encounters, which may undercount demand. For clusters with multiple high-volume tables, `sum` or `max` is best.
This example scales the merge-peer-iii pool between 1 and 10 replicas, targeting 50 candidates across all pods, with sum aggregating across all tables and targets.
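A sketch of such an `hdxscalers` entry, with field names and nesting assumed from the parameters described above (verify the exact schema against the HDX Autoscaler documentation):

```yaml
# Hypothetical sketch: scale the merge-peer-iii pool on the candidates metric.
# Field names and nesting are assumptions based on the parameters above.
hdxscalers:
  merge-peer-iii:
    metric: candidates
    op: sum            # aggregate samples across all tables and targets
    target_value: 50   # desired candidates per pod
    min: 1
    max: 10
```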
Use this query to monitor for saturation.
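A PromQL expression along these lines would surface saturation (a sketch — the 500 threshold assumes the recommended `merge_max_candidates` value, and any labels beyond those documented above are assumptions):

```promql
# Flags pools whose buffered candidates sit at the construction limit,
# meaning real demand may exceed what the autoscaler can observe.
sum(candidates) by (pool) >= 500
```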
## OOM recovery and self-tuning
Merge peer pods can be OOM-killed when a merge candidate requires more memory than the container limit allows. When this happens, the partitions from the failed merge become eligible again and are included in future candidates. merge-controller also tracks actual memory usage per merge and adjusts future estimates using an exponentially weighted moving average. Over time this reduces the likelihood of OOM kills for similar workloads. After a cluster first encounters OOM events for a given table, allow several merge cycles before intervening.
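The moving-average update can be sketched in one line of Python (illustrative only; the smoothing factor `alpha=0.3` is an assumption, not a documented Hydrolix value):

```python
def update_estimate(previous_estimate, observed_usage, alpha=0.3):
    """Blend the latest observed merge memory usage into the running
    estimate. Larger alpha weights recent observations more heavily."""
    return alpha * observed_usage + (1 - alpha) * previous_estimate

# A merge estimated at 2.0 GiB that actually used 4.0 GiB nudges the
# estimate upward, reducing under-provisioning for similar future merges.
estimate = update_estimate(2.0, 4.0)  # → 2.6
```

Because each observation only shifts the estimate partway, a single outlier doesn't destabilize future sizing, which is why several merge cycles are needed before the estimates settle.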
If a merge peer pool shows persistent OOM restarts that don't resolve after several cycles, the container memory limit likely needs to be increased. See Troubleshoot merge-peer OOM for steps to identify which container is affected and how to apply the fix.
## Scale horizontally or vertically
When merge is falling behind, the right response depends on the symptom.
Scale horizontally by adding merge peer replicas when the candidates metric is consistently high or saturated. This means there's enough work for more pods to process in parallel.
Scale vertically by increasing memory per pod when individual merge peers are OOM-killed persistently. This means the partitions being merged are too large for the current memory limit. See Troubleshoot merge-peer OOM for steps to diagnose and resolve this.
## Disable merge on tables
All tables have merge enabled by default. Disable and re-enable merge with the PATCH table API endpoint. Disabling merge stops new merge jobs immediately, but jobs already in the queue complete.
**Disable merge only under special circumstances**

Disabling merge isn't recommended and may result in performance degradation.
For example, this API request enables merge for a given table:
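A hedged sketch of such a request (the endpoint path, placeholder IDs, host, and payload shape are assumptions — verify against the Hydrolix Config API reference):

```shell
# Hypothetical sketch: enable merge on a table via the Config API.
# Replace the placeholder IDs and verify the payload shape against
# the Hydrolix API reference before using.
curl -X PATCH \
  "https://my-cluster.example.com/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}" \
  -H "Authorization: Bearer $HDX_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"settings": {"merge": {"enabled": true}}}'
```

Setting `"enabled": false` in the same payload would disable merge.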
To disable merge in the Hydrolix UI, navigate to Data, select the table, find Merge settings under Advanced options, select the three dots on the right of that row, and select Disable Merge.

## Merge pools
Hydrolix clusters create merge components in three pools: small, medium, and large. Each pool handles partitions that differ by age, size, and time width. This ensures optimal partition sizing and spreads merge workloads across old and new data.
This table shows the criteria used to assign partitions to merge pools:
| If the max Primary Timestamp is: | ...and the size is within: | ...and the time width is within: | Resulting Merge Pool |
|---|---|---|---|
| Under 10 minutes old | 1 GB | 1 hour | small (merge-i) |
| Between 10 minutes and 1 hour old | 2 GB | 1 hour | medium (merge-ii) |
| Between 1 hour and 90 days old | 4 GB | 1 hour | large (merge-iii) |
For example, a partition with a last timestamp 15 minutes ago, a size of 513 MB, and a width of 37 minutes goes to the medium pool.
A 2.5 GB partition isn't eligible for merge until 1 hour after its last timestamp, and goes to the large pool only if other eligible partitions smaller than 1.5 GB exist to merge with.
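The table's rules can be expressed as a short Python sketch (an illustration of the assignment criteria above, not the actual Hydrolix implementation; candidate construction within a pool is a separate step):

```python
from datetime import timedelta

def assign_merge_pool(age, size_gb, width):
    """Return the merge pool for a partition, given the age of its max
    primary timestamp, its size in GB, and its time width."""
    if width > timedelta(hours=1):
        return None  # every pool requires a time width within 1 hour
    if age < timedelta(minutes=10):
        return "small" if size_gb <= 1 else None   # merge-i
    if age < timedelta(hours=1):
        return "medium" if size_gb <= 2 else None  # merge-ii
    if age <= timedelta(days=90):
        return "large" if size_gb <= 4 else None   # merge-iii
    return None  # older than the 90-day lookback

# The worked example above: 15 minutes old, 513 MB, 37 minutes wide
pool = assign_merge_pool(timedelta(minutes=15), 0.513, timedelta(minutes=37))
# → "medium"
```

Note how a 2.5 GB partition falls through the small and medium branches and only becomes eligible once its age reaches the large pool's 1-hour threshold, matching the example above.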
**Partitions older than 90 days aren't considered by default**

The merge system looks back only 90 days for partitions eligible for compaction. This limit is configurable through `merge_target_overrides`.
📘 **Primary timestamp**: For more information on primary timestamps, see Timestamp Data Types.
## Custom merge pools
To separate merge workloads and avoid “noisy neighbor” effects, create additional merge pools targeted at specific tables. For example, create a dedicated merge pool for a Summary Table to separate that workload from the main merge process.
Create custom merge pools with the pools API endpoint, then apply those pools to tables with the tables API endpoint.
### Create pools
This Config API command creates a custom pool using the pools API endpoint:
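A hedged sketch of such a command (the endpoint path, host, and payload nesting are assumptions — verify against the Hydrolix Config API reference; the field names match the settings described below):

```shell
# Hypothetical sketch: create a custom merge pool via the Config API.
# Replace the placeholder org ID and host, and verify the payload
# nesting against the Hydrolix API reference before using.
curl -X POST \
  "https://my-cluster.example.com/config/v1/orgs/{org_id}/pools/" \
  -H "Authorization: Bearer $HDX_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "my-pool-name-II",
        "settings": {
          "service": "merge-peer",
          "scale_profile": "II",
          "cpu": 2,
          "memory": "10Gi",
          "replicas": 3,
          "storage": "5Gi"
        }
      }'
```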
In the Hydrolix UI, select Add new from the upper right-hand menu, then select Resource pool.

Use these settings to configure your pool:
| Object | Description | Value/Example |
|---|---|---|
| `service` | The service workload the pool uses. For merge, this is `merge-peer`. | `merge-peer` |
| `scale_profile` | The merge pool size, corresponding to small, medium, or large. | `I`, `II`, or `III` |
| `name` | The name used to identify your pool. | Example: `my-pool-name-II` |
| `cpu` | The amount of CPU provided to pods. | A numeric value; defaults are specified in Scale Profiles. Example: `2` |
| `memory` | The amount of memory provided to pods. | A string value; defaults are specified in Scale Profiles. Default units are Gi. Example: `10Gi` |
| `replicas` | The number of pods to run in the pool. | A numeric value or hyphenated range; defaults are specified in Scale Profiles. Examples: `3` and `1-5` |
| `storage` | The amount of ephemeral storage provided to pods. | A string value; defaults are specified in Scale Profiles. Default units are Gi. Example: `5Gi` |
### Assign pools to tables
This API request assigns custom pools to a table using the tables API endpoint:
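A hedged sketch of such a request (the endpoint path, placeholder IDs, and the payload shape keying pools by size are assumptions — verify against the Hydrolix Config API reference):

```shell
# Hypothetical sketch: assign custom merge pools to a table.
# Pool names here are placeholders; verify the payload shape against
# the Hydrolix API reference before using.
curl -X PATCH \
  "https://my-cluster.example.com/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}" \
  -H "Authorization: Bearer $HDX_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"settings": {"merge": {"pools": {"small": "my-pool-i", "medium": "my-pool-ii", "large": "my-pool-iii"}}}}'
```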
To configure this in the Hydrolix UI, navigate to Data, select the table, find Merge settings under Advanced options, and select the pool assignment menu:

**Use all three pools**
For optimal merge performance, provide a large, medium, and small pool.
## Troubleshoot merge-peer OOM
As described in OOM recovery and self-tuning, sporadic OOM kills don't require action. Follow these steps when a merge peer pool shows persistent OOM restarts that don't resolve on their own.
Each merge-peer pod runs two main containers: the primary `merge-peer` container and a secondary `merge-indexer` container, which is the turbine sidecar that builds indexes for merged partitions. Each container has its own memory limit, and they're configured independently.
### Identify which container is being OOM-killed
When a merge peer pod is OOM-killed, first determine which of the two containers hit its memory limit:
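This check can be done with standard Kubernetes tooling (the pod name and namespace are placeholders):

```shell
# Show the last terminated state of each container in the pod; look for
# "Reason: OOMKilled" and note which container name it belongs to.
kubectl describe pod <merge-peer-pod-name> -n <namespace> | grep -A 5 "Last State"
```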
The output shows the terminated container name and reason. If `turbine` appears with `Reason: OOMKilled`, the merge indexer sidecar is the problem, not the primary `merge-peer` container.
You can also check for recent OOM events across all merge peer pods:
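One generic way to survey all merge peer pods at once (the `app=merge-peer` label selector is an assumption — adjust it to your cluster's labels):

```shell
# Print each pod's last terminated reason; "OOMKilled" flags memory kills.
kubectl get pods -n <namespace> -l app=merge-peer \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
```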
If the OOM-killed container is `turbine`, increase the merge indexer memory using `spec.scale.profile`.

If the OOM-killed container isn't `turbine` and instead matches the pool name, such as `merge-peer-iii`, increase memory in the pool definition's `memory` field instead.
### Default merge indexer memory by era
| Generation | Profile | Default merge-indexer memory | Default CPU |
|---|---|---|---|
| Era I | `I` | 4 Gi | 2 |
| Era II | `II` | 6 Gi | 2 |
| Era III | `III` | 12 Gi | 2 |
### Fix: increase memory on the merge indexer container
Override the merge indexer memory through `spec.scale.profile`. Don't use the pool definition's top-level `memory` field, as that controls the primary merge peer container, not the indexer. The two containers resolve their resources independently.
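As a hedged illustration of such an override (the schema under `spec.scale.profile`, including the `merge-indexer-iii` key and field names, is an assumption — see Custom Scale Profiles for the authoritative shape):

```yaml
# Hypothetical sketch: raise Era III merge-indexer memory in the
# HydrolixCluster resource. Key names are assumptions; verify against
# the Custom Scale Profiles documentation.
spec:
  scale:
    profile:
      merge-indexer-iii:
        memory: 16Gi   # raised from the 12 Gi default
```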
Apply the change:
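For example, with standard Kubernetes tooling (the file name and namespace are placeholders):

```shell
# Apply the updated HydrolixCluster spec; the operator handles the rollout.
kubectl apply -f hydrolixcluster.yaml -n <namespace>
```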
The operator detects the spec change, updates the deployment, and Kubernetes rolls the affected merge-peer pods with the new resources. No manual `kubectl rollout restart` is required.
See Custom Scale Profiles for an example of creating a named profile and attaching it to a pool.