v5.5.0
Row-level access control, ip and uuid datatypes, column alias management in UI
Notable new features⚓︎
Row-level access control⚓︎
- Data can now be filtered for users based on the contents of the row and users' assigned roles.
- Each filter is implemented as a row policy.
- RBAC roles control which users are affected by which row policies.
- Learn more in Row-Level Access Control
Column alias management for data tables⚓︎
You can now create and manage column aliases directly from the UI. Tabs are fully synchronized to preserve input when switching.
- Column alias list
  - In the UI at `/data/tables/{table_id}/column-management/`: View existing aliases with columns for Alias and Expression
- Create new alias
  - In the UI at `/data/tables/{table_id}/column-management/new-column-alias/`
  - Alias form: Add a name and expression
  - Analyze expression:
    - Select optional columns to preview with the alias
    - See generated SQL, copy it, and set row limits
    - Run a test query to preview results
- Edit alias
  - In the UI at `/data/tables/{table_id}/column-management/{alias_id}/`: Edit or delete existing aliases
Support for new ip and uuid datatypes⚓︎
IP addresses and UUIDs may now be used as first-class datatypes in transforms, ingest, and queries.
- The `ip` datatype supports both IPv4 and IPv6 addresses.
- The `uuid` datatype supports standard 128-bit UUIDs.
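Here's an example of field definitions in a transform using both datatypes. This is a minimal sketch: the column names are illustrative, and the rest of the transform (primary timestamp column, format settings, and so on) is omitted.

```json
{
  "settings": {
    "output_columns": [
      { "name": "client_ip", "datatype": { "type": "ip" } },
      { "name": "request_id", "datatype": { "type": "uuid" } }
    ]
  }
}
```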
Upgrade instructions⚓︎
⚠️ Do not skip release versions.
Skipping versions during upgrades may result in system instability. You must upgrade sequentially through each release version.
Example: Upgrade from 5.4 → 5.5 → 5.6, not 5.4 → 5.6
- Apply the new Hydrolix operator
If you have a self-managed installation, apply the new operator directly with the kubectl command examples below. If you're using Hydrolix-supplied tools to manage your installation, follow the procedure prescribed by those tools.
### GKE
### EKS
### LKE and AKS
- Monitor the upgrade process
Kubernetes jobs named `init-cluster` and `init-turbine-api` will automatically run to upgrade your entire installation to match the new operator's version number. This will take a few minutes, during which time you can observe your pod restarts with your favorite Kubernetes monitoring tool.
Ensure both the `init-cluster` and `init-turbine-api` jobs have completed successfully and that the `turbine-api` pod has restarted without errors. After that, view the UI and use the API of your new installation as a final check.
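For example, you can check this from the command line with kubectl (a sketch; substitute your cluster's namespace):

```shell
# Confirm the upgrade jobs completed and the API pod is running (namespace is a placeholder)
kubectl get jobs init-cluster init-turbine-api -n <your-namespace>
kubectl get pods -n <your-namespace> | grep turbine-api
```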
If the `turbine-api` pod doesn't restart properly or other functionality is missing, check the logs of the `init-cluster` and `init-turbine-api` jobs for details about failures. This can be done using the `k9s` utility or with the `kubectl` command:
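For example (a sketch; substitute your cluster's namespace):

```shell
# Inspect the upgrade job logs for details about failures
kubectl logs job/init-cluster -n <your-namespace>
kubectl logs job/init-turbine-api -n <your-namespace>
```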
If you still need help, contact Hydrolix support.
Changelog⚓︎
Updates⚓︎
These changes include version upgrades and internal dependency bumps.
Cluster Operations⚓︎
- Upgraded gunicorn from 20.0.4 -> 20.1.0 to prevent cluster problems. See more information about the bugfix here: https://github.com/benoitc/gunicorn/issues/1913
- Upgraded kopf and updated the operator RBAC from v0.0.0-20220102113305-2298ace6d09d to v0.0.0-20250630131328-58d95d85e994.
- Upgraded HTTP Proxy to version 0.4.0.
- Switched from `pycryptodome` to `pycryptodomex` version `3.23.0` to eliminate a security vulnerability.
Intake and Merge⚓︎
- Updated Goja dependency from v0.0.0-20220102113305-2298ace6d09d to v0.0.0-20250630131328-58d95d85e994.
- Updated the following crates: `fake` to version `4.3.0` and `rand` to version `0.9.1`.
- Upgraded the `golang.org/x/oauth2` library from version `0.23` to `0.27` to resolve a high-severity vulnerability (CVE-2025-22868). The issue allowed attackers to send malformed tokens that could cause excessive memory consumption during parsing. Related advisory: GHSA-6v2p-p543-phr9.
- Updated the Go runtime from version 1.23.0 to 1.23.3 to address a segfault during autoingest.
UI⚓︎
- Upgraded the `form-data` library to version 2.5.4, 3.0.4, or 4.0.4 or above (depending on the major version line) to prevent a `Math.random()`-related security vulnerability.
Improvements⚓︎
Config API⚓︎
- Added a `/download` endpoint to the `DictionaryFile` resource that downloads the dictionary file using the same permissions as `view_dictionaryfile`.
- Implemented service account token generation within the Config API. This allows more fine-grained control over token generation, TTLs, and revocations.
- Updated the `/defaults` Config API endpoint to return updated default settings values. Updated the schema for multiple endpoints for more clarity on the `settings` field. Changed the batch job default `max_minutes_per_partitions` from `360` to `60`.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
- Enabled storing the HDX Deployment ID (`deployment_id`) on multi-tenant Project resources.
- API query string parameter values for the `/project` endpoint are no longer converted to lowercase, preserving case sensitivity.
Cluster Operations⚓︎
- Hydrolix now includes both cluster-level and project-level deployment IDs in all usage reports, logs, and monitoring data.
  - This ensures traceability across `usage.meter`, `hydro.logs`, and alerting pipelines.
  - Deployment ID sources:
    - `HDX_DEPLOYMENT_ID`: cluster-wide, env var or CLI flag
    - `project.hdx_deployment_id`: per project, config data
- Improved initialization logic to ensure the `hydro.logs` transform includes the `deployment_id` column, allowing logs to be filtered by deployment.
  - Added a service to guarantee a transform contains specific `output_columns`.
  - Updated `init_ci` to use this service to ensure `deployment_id` is present.
  - Refactored transform business logic into `TransformSerializer` for reuse and clarity.
  - Creates or updates the transform and settings only if `deployment_id` is missing.
  - Leaves other transform settings untouched.
  - Raises clear errors if a conflicting datatype for the column exists.
- Include the Hydrolix deployment ID in logs and metrics emitted by a cluster. The deployment ID is propagated to various workloads by the operator. If the operator can't obtain a deployment ID, it defaults to propagating the namespace.
- When Grafana is enabled in-cluster, the operator creates a default Hydrolix datasource in Grafana, replacing the default ClickHouse datasource.
- Added two metrics: `fqdn_cert_expiry`, which indicates the FQDN certificate expiry time in seconds, and `fqdn_tls_cert_expiry_diff`, which indicates whether the FQDN certificate and the `tls.crt` in the `traefik-tls` secret differ (which should not be the case).
- Introduced multiple enhancements to the `hdx-scaler` autoscaler to support more responsive and stable scaling:
  - Separate cooldowns: `cool_up_seconds` and `cool_down_seconds` now control scale-up and scale-down cooldowns independently.
  - Range-based scaling: `metric_min` and `metric_max` define bounds for a metric; autoscaling adjusts pod count relative to that range.
  - Scaling throttles: New `scale_up_throttle` and `scale_down_throttle` options limit how quickly scale adjustments are made (as a percentage).
  - Tolerance window: `tolerance_up` and `tolerance_down` define a dead zone around the target where no scaling is triggered.
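  As a purely illustrative sketch, the new options might be combined as follows. Only the option names come from this release; the file layout and the values are hypothetical:

  ```yaml
  # Hypothetical hdx-scaler configuration fragment; layout and values are illustrative.
  cool_up_seconds: 120        # minimum time between scale-up events
  cool_down_seconds: 600      # minimum time between scale-down events
  metric_min: 0.2             # lower bound of the metric range
  metric_max: 0.8             # upper bound of the metric range
  scale_up_throttle: 50       # cap each scale-up adjustment (percentage)
  scale_down_throttle: 25     # cap each scale-down adjustment (percentage)
  tolerance_up: 0.05          # dead zone above the target where no scaling is triggered
  tolerance_down: 0.05        # dead zone below the target
  ```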
- Added an interactive help window to the `hdx-scaler` terminal UI. Improves usability for users managing scaling settings from the TUI.
  - Press `h` to open the help panel, which explains available configuration fields.
  - Press `ESC` or `q` to close it.
- The `hdx-scaler` service now exposes detailed Prometheus metrics to improve the observability of autoscaling behavior. These metrics include:
  - `hdxscaler_total_scalers`: The total number of autoscalers running
  - `hdxscaler_ratio`: The calculated ratio, based on abs(normalized metric - normalized target) in Metric Range mode, or target metric / observed metric in Metric Target mode
  - `hdxscaler_current_replicas`: The current number of replicas
  - `hdxscaler_target_count`: The number of targets the measured value is being averaged from
  - `hdxscaler_measured_value`: The measured average value of the target group metrics
  - `hdxscaler_desired_replicas`: The calculated number of replicas to scale to before applying any bounds checks, sensitizing, or throttling
  - `hdxscaler_bounded_replicas`: What desired replicas are set to after applying all bounds checks, sensitizing, and throttling
  - `hdxscaler_normalized_metric`: What the measured value is normalized to when using Metric Range as the scaling mode
  - `hdxscaler_normalized_target`: What the target value is normalized to when using Metric Range as the scaling mode
  - `hdxscaler_delta`: The difference between the normalized metric and normalized target, used in Metric Range mode
  - `hdxscaler_sensitivity_calc`: The value calculated by the sensitizer, used in Metric Range scaling mode to calculate the desired replicas
  - `hdxscaler_throttle_up_calc`: If scaling up, the maximum number of pods to scale up to; used instead of desired replicas if smaller
  - `hdxscaler_throttle_down_calc`: If scaling down, the number of pods to scale down by; used if larger than desired replicas

  Each metric (except `hdxscaler_total_scalers`) has the following labels:
  - `deployment`: the Deployment being scaled
  - `alias`: if that deployment has an alias, e.g. for `intake-head`, `alias="stream-head"`
  - `app`: the service or pool name being scraped for metrics to watch
  - `title`: The name for this autoscaler, which includes the deployment name and the metric / metric labels used for scale calculations
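  For example, the metrics can be inspected directly with PromQL; the `deployment` label value below is illustrative:

  ```promql
  # Desired vs. bounds-checked replica counts for one scaled deployment
  hdxscaler_desired_replicas{deployment="intake-head"}
  hdxscaler_bounded_replicas{deployment="intake-head"}

  # Total number of autoscalers currently running
  sum(hdxscaler_total_scalers)
  ```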
- HDX Scaler now supports loading its config from a Kubernetes ConfigMap using the `CONFIG_MODE=k8s` environment variable.
  - Faster config changes
  - Default remains unchanged
  - Improves deployment stability
- HDX Scaler now ignores services when the `scale_off: true` tunable is set. This prevents accidental restarts of pods that should be scaled off.
  - To fully shut down the cluster, HDX Scaler must be scaled to zero manually.
- Added regular expressions (regex) for header- and query-based dynamic ingest routing. To enable regex matching, prefix the value with `regex|`. See the example below:
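  The original example isn't reproduced here; as a purely illustrative sketch, a header-based rule might look like the following. The key names are hypothetical placeholders; only the `regex|` value prefix is from this release:

  ```yaml
  # Hypothetical routing rules; the keys are placeholders, not actual tunable names.
  header_routing:
    x-tenant: "regex|^(acme|globex)-.*$"   # regex match on the header value
    x-region: "us-east-1"                  # without the "regex|" prefix, the value is matched literally
  ```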
- Quesma can now be exposed externally to operate with an existing Kibana deployment using the `enable_public_access` tunable. See Enable public access to Quesma for more information.
- Gunicorn will now scale up async workers in accordance with the number of CPUs available to the `hdx-traefik-auth` container, saving administrators time when adding or removing resources from Traefik. This behavior can be overridden using the `hdx_traefik_auth_workers` tunable.
- Improved how email is handled during cluster creation and password resets, especially for customer-deployed environments.
  - Cluster creation emails are now only sent to `ADMIN_EMAIL` (if set). If not set, no email is sent.
  - Prevented all emails (invites, access codes, password resets) from being sent to `CI_USER_EMAIL` (e.g. `hdx@hydrolix.net` or `$OWNER@hydrolix.net`).
  - Invite logic now matches password reset logic for email suppression.
- Added support for custom SMTP configuration to allow customer-managed delivery.
- Added support for updating the patch level of third-party software in internal images. Added the `patch_date` tunable.
Intake and Merge⚓︎
- Added support to the Vector software inside the cluster allowing specification of Bearer tokens to accompany messages sent to `logs_sink_remote` endpoints. This facilitates the use of service accounts to send logs to remote endpoints.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
- Removed the `Serialize` derivation from objects in the common crate, reducing redundancy in the code.
- Added a new `kafka_sasl_plain` credential type for use with Kafka ingest. The `kafka-peer` can now authenticate with SASL_SSL using `username` and `password` credentials.
Query⚓︎
- Replaced the use of `min_timestamp` and `max_timestamp` in catalog queries with a more specific SQL condition. This prevents contradictory conditions from fetching any partitions. (For example, `WHERE primary < timestamp AND primary > timestamp` should return no results.)
- Fixed an incorrect table name in log messages like the following: `db='{db_name}' Storage='TurbineStorage' removed` (`TurbineStorage` being incorrect).
- Columns in the GROUP BY clause of summary table SQL may now be removed from the summary table SQL. This allows a high-cardinality column to be removed from a summary table without dropping and re-creating the summary table.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
Security⚓︎
- Set a `HYDROLIX_TOKEN` cookie via the API using the attributes `secure`, `httponly`, and `samesite`. Use the `/users/current` API endpoint to determine expired tokens and clear cookies.
- Support for row-level access control. See Notable new features for details.
UI⚓︎
- Support for row-level access control. See Notable new features for details.
- The UI's browser tab title is now set to the cluster hostname to ease the management of multiple clusters.
- Added a "Column Management" sidebar to the UI in Advanced Options. This feature is only available for non-summary tables.
- View and edit current and past column names
- Optionally set an alias as the current display name
- Add new aliases to existing columns
Bug Fixes⚓︎
Config API⚓︎
- The `add_names` endpoint now adds column names, rather than returning an HTTP 500 error.
- Pools-related RBAC permissions are now checked properly so only authorized users can edit resource pools in the API and UI.
Intake and Merge⚓︎
- Added the ability to split merge target bounds into smaller chunks. This is an attempt to address an issue with slow merges for late-arriving data in Era 3.
- The `active` field in a transform is now optional.
- Support for new `ip` and `uuid` datatypes has been added.
- The new merge controller used to expect strings for the `min` and `max` settings for field limits. It now expects integers.
- The merge controller now properly handles `null` or empty `shard_key` columns in the catalog.
- Reattached the `duplicate_partitions` metric, which was unintentionally disconnected when the `AdaptivePartitionSource` abstraction was introduced. The metric now tracks duplicate partitions during sourcing and consistently updates in telemetry.
- Changed a misleading `WARNING` message in `intake-head` to `INFO` during rejector startup. This reduces noise in logs and prevents confusion during troubleshooting. No functional changes or documentation impact.
- Added a safeguard to prevent a rare buffer overrun panic in the custom CSV reader used by `intake-head`.
  - The panic was triggered by an out-of-bounds write during CSV ingestion, likely due to a misaligned buffer size and input row length.
  - A pre-write check was added to grow the destination buffer as needed.
  - The patch does not fix the root cause of the excessive buffer size, but avoids the crash and allows the ingestion to recover gracefully.
  - Also improved support for custom-sized buffer requests by configuring `bufio.Scanner` to handle larger token lengths as needed.
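A sketch of the general Go pattern for raising the `bufio.Scanner` token limit (illustrative only; this is the standard-library mechanism, not the actual `intake-head` code):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

func main() {
	r := strings.NewReader("a,very,long,csv,row\n")
	scanner := bufio.NewScanner(r)

	// By default a Scanner rejects tokens larger than bufio.MaxScanTokenSize (64 KiB).
	// Buffer lets the caller provide an initial buffer and a larger maximum token size.
	const maxRowSize = 4 * 1024 * 1024 // hypothetical 4 MiB cap on a single row
	scanner.Buffer(make([]byte, 0, 64*1024), maxRowSize)

	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		fmt.Println("scan error:", err)
	}
}
```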
Cluster Operations⚓︎
- `hdx-scaler` now recognizes its configuration settings.
- Fixed the install for `hdx-scaler`, putting it in `/usr/local/bin/` where it's on the system `$PATH`.
- The new cluster login invitation email now gives the user the correct URL to log into the cluster.
- The ACME init job no longer performs a separate, unneeded DNS resolution check before attempting certificate issuance. This resolves issues where DNS records existed but pods couldn’t see them due to stale cache.
  - Removed the standalone DNS check from the `init-acme` container.
  - Consolidated route verification into the `start-lego` script using HTTP-based validation.
  - Mounted the `acme-account` secret in `init-acme` to enable reuse of existing account data.
  - Restructured the ACME logic to be more modular and resilient for both job and cronjob execution.
UI⚓︎
- Fixed a bug with disappearing data in the Transform SQL field.
- Updated the total volume, compression ratio, and compression percent calculations for table details used in UI analysis to correctly use `raw_size` instead of `mem_size`.
- The Hydrolix UI no longer relies on `HYDROLIX_USER_ID` and `HYDROLIX_ORGANIZATION_ID` cookies. Instead, session continuity uses only `HYDROLIX_TOKEN`, with user details from the authenticated `/current` endpoint. This fixes an issue where the UI would fail to load properly if only the session token cookie was present.
- Error messages for bad requests to alter/batch jobs are more descriptive.
- Pages that don't use API pagination now show full lists of items.