25 August 2025 - v5.5.0
Row-level access control, `ip` and `uuid` datatypes, column alias management in UI
Notable new features
Row-level access control
- Data can now be filtered for users based on the contents of the row and users' assigned roles.
- Each filter is implemented as a row policy.
- RBAC roles control which users are affected by which row policies.
- Learn more in Row-Level Access Control
Column alias management for data tables
You can now create and manage column aliases directly from the UI. Tabs are fully synchronized to preserve input when switching.
- Column alias list: In the UI at `data/tables/{table_id}/column-management/`, view existing aliases with columns for Alias and Expression.
- Create new alias: In the UI at `/data/tables/{table_id}/column-management/new-column-alias/`:
  - Alias form: Add a name and expression
  - Analyze expression:
    - Select optional columns to preview with the alias
    - See generated SQL, copy it, and set row limits
    - Run a test query to preview results
- Edit alias: In the UI at `/data/tables/{table_id}/column-management/{alias_id}/`, edit or delete existing aliases.
Support for new `ip` and `uuid` datatypes
IP addresses and UUIDs may now be used as first-class datatypes in transforms, ingest, and queries.
- The `ip` datatype supports both IPv4 and IPv6 addresses. Here's an example of field definitions in a transform:

  ```json
  { "name": "ipv4_field", "datatype": { "type": "ip", "default": "127.0.0.1" } }
  { "name": "ipv6_field", "datatype": { "type": "ip", "default": "fe80::22bb:2ebd:da90:3439" } }
  ```

- The `uuid` datatype supports standard 128-bit UUIDs. Here's an example of a field definition in a transform:

  ```json
  { "name": "uuid_field", "datatype": { "type": "uuid", "default": "88a7bdff-ce1f-4ce6-ae6f-f506937e315c" } }
  ```
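The default values in the field definitions above are ordinary IPv4/IPv6 and UUID string literals. As a quick sanity check outside Hydrolix, Python's standard library can validate candidate default values before they go into a transform (an illustrative sketch only; Hydrolix performs its own validation):

```python
import ipaddress
import uuid

def is_valid_ip(value: str) -> bool:
    """Accepts both IPv4 and IPv6 literals, mirroring the ip datatype."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

def is_valid_uuid(value: str) -> bool:
    """Accepts standard 128-bit UUID strings, mirroring the uuid datatype."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False

# The default values from the field definitions above:
print(is_valid_ip("127.0.0.1"))                               # True
print(is_valid_ip("fe80::22bb:2ebd:da90:3439"))               # True
print(is_valid_uuid("88a7bdff-ce1f-4ce6-ae6f-f506937e315c"))  # True
```
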
Upgrade instructions (update the Hydrolix version number in the links below)
- Apply the new Hydrolix operator

  If you have a self-managed installation, apply the new operator directly with the `kubectl` command examples below. If you're using Hydrolix-supplied tools to manage your installation, follow the procedure prescribed by those tools.

  GKE:
  ```shell
  kubectl apply -f "https://www.hydrolix.io/operator/v5.5.0/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"
  ```

  EKS:
  ```shell
  kubectl apply -f "https://www.hydrolix.io/operator/v5.5.0/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
  ```

  LKE and AKS:
  ```shell
  kubectl apply -f "https://www.hydrolix.io/operator/v5.5.0/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}"
  ```
- Monitor the upgrade process

  Kubernetes jobs named `init-cluster` and `init-turbine-api` will automatically run to upgrade your entire installation to match the new operator's version number. This will take a few minutes, during which time you can observe your pods' restarts with your favorite Kubernetes monitoring tool.

  Ensure both the `init-cluster` and `init-turbine-api` jobs have completed successfully and that the `turbine-api` pod has restarted without errors. After that, view the UI and use the API of your new installation as a final check.

  If the `turbine-api` pod doesn't restart properly or other functionality is missing, check the logs of the `init-cluster` and `init-turbine-api` jobs for details about failures. This can be done using the `k9s` utility or with the `kubectl` command:

  ```shell
  kubectl logs -l app=init-cluster
  kubectl logs -l app=init-turbine-api
  ```

  If you still need help, contact Hydrolix support.
Changelog
Updates
These changes include version upgrades and internal dependency bumps.
Cluster Operations
- Upgraded gunicorn from 20.0.4 -> 20.1.0 to prevent cluster problems. See more information about the bugfix here: https://github.com/benoitc/gunicorn/issues/1913
- Upgraded kopf and updated the operator RBAC from v0.0.0-20220102113305-2298ace6d09d to v0.0.0-20250630131328-58d95d85e994.
- Upgraded HTTP Proxy to version 0.4.0.
- Switched from `pycryptodome` to `pycryptodomex` version `3.23.0` to eliminate a security vulnerability.
Intake and Merge
- Updated Goja dependency from v0.0.0-20220102113305-2298ace6d09d to v0.0.0-20250630131328-58d95d85e994.
- Updated the following crates: `fake` to version `4.3.0` and `rand` to version `0.9.1`.
- Upgraded the `golang.org/x/oauth2` library from version `0.23` to `0.27` to resolve a high-severity vulnerability (CVE-2025-22868). The issue allowed attackers to send malformed tokens that could cause excessive memory consumption during parsing. Related advisory: GHSA-6v2p-p543-phr9
- Updated the Go runtime from version 1.23.0 to 1.23.3 to address a segfault during autoingest.
UI
- Upgraded the form-data library to version 2.5.4, 3.0.4, 4.0.4, or above to prevent a `Math.random()`-related security vulnerability.
Improvements
Config API
- Added a `/download` endpoint to the `DictionaryFile` resource that downloads the dictionary file using the same permissions as `view_dictionaryfile`.
- Implemented service account token generation within the config API. This allows more fine-grained control over token generation, TTLs, and revocations.
- Updated the `/defaults` Config API endpoint to return updated default settings values. Updated the schema for multiple endpoints for more clarity on the 'settings' field. Changed the batch job default `max_minutes_per_partitions` from `360` to `60`.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
- Enabled storing the HDX Deployment ID (`deployment_id`) on multi-tenant Project resources.
- API query string parameter values for the `/project` endpoint are no longer converted to lowercase, preserving case sensitivity.
Cluster Operations
- Hydrolix now includes both cluster-level and project-level deployment IDs in all usage reports, logs, and monitoring data.
  - This ensures traceability across `usage.meter`, `hydro.logs`, and alerting pipelines.
  - Deployment ID sources:
    - `HDX_DEPLOYMENT_ID`: cluster-wide, env var or CLI flag
    - `project.hdx_deployment_id`: per project, config data
- Improved initialization logic to ensure the `hydro.logs` transform includes the `deployment_id` column, allowing logs to be filtered by deployment.
  - Added a service to guarantee a transform contains specific `output_columns`.
  - Updated `init_ci` to use this service to ensure `deployment_id` is present.
  - Refactored transform business logic into `TransformSerializer` for reuse and clarity.
  - Creates or updates the transform and settings only if `deployment_id` is missing.
  - Leaves other transform settings untouched.
  - Raises clear errors if a conflicting datatype for the column exists.
- Include the Hydrolix deployment ID in logs and metrics emitted by a cluster. The deployment ID is propagated to various workloads by the operator. If the operator can't obtain a deployment ID, it defaults to propagating the namespace.
- When Grafana is enabled in-cluster, the operator creates a default Hydrolix datasource in Grafana, replacing the default ClickHouse datasource.
- Added two metrics: `fqdn_cert_expiry`, which indicates the FQDN cert expiry time in seconds, and `fqdn_tls_cert_expiry_diff`, which indicates whether the FQDN cert and the tls.crt in the traefik-tls secret differ (which should not be the case).
- Introduced multiple enhancements to the `hdx-scaler` autoscaler to support more responsive and stable scaling:
  - Separate cooldowns: `cool_up_seconds` and `cool_down_seconds` now control scale-up and scale-down cooldowns independently.
  - Range-based scaling: `metric_min` and `metric_max` define bounds for a metric; autoscaling adjusts pod count relative to that range.
  - Scaling throttles: New `scale_up_throttle` and `scale_down_throttle` options limit how quickly scale adjustments are made (as a percentage).
  - Tolerance window: `tolerance_up` and `tolerance_down` define a dead zone around the target where no scaling is triggered.
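These tunables compose in a fairly mechanical way. The sketch below is not hdx-scaler's implementation (the function name and the proportional formula are assumptions for illustration), but it shows how a range-normalized metric, a tolerance dead zone, percentage throttles, and replica bounds might interact:

```python
def desired_replicas(metric, metric_min, metric_max, target,
                     current, scale_up_throttle, scale_down_throttle,
                     tolerance_up, tolerance_down, min_replicas, max_replicas):
    # Range-based scaling: normalize the metric into [0, 1] over [metric_min, metric_max].
    normalized = (metric - metric_min) / (metric_max - metric_min)
    delta = normalized - target  # target is also expressed on the 0..1 scale

    # Tolerance window: a dead zone around the target where no scaling happens.
    if -tolerance_down <= delta <= tolerance_up:
        return current

    # Scale proportionally to how far outside the tolerance window we are.
    desired = round(current * (1 + delta))

    # Scaling throttles: cap the per-step change as a percentage of current replicas.
    if desired > current:
        cap = current + max(1, int(current * scale_up_throttle))
        desired = min(desired, cap)
    elif desired < current:
        cap = current - max(1, int(current * scale_down_throttle))
        desired = max(desired, cap)

    # Bounds check against the configured replica limits.
    return max(min_replicas, min(max_replicas, desired))

# Example: metric 80 in range [0, 100], target 0.5, 10 current replicas,
# 20% throttles, 5% tolerances, bounds 1..50 -> scale up, capped by the throttle.
print(desired_replicas(80, 0, 100, 0.5, 10, 0.2, 0.2, 0.05, 0.05, 1, 50))  # 12
```

With the same settings, a metric of 52 falls inside the tolerance window and leaves the replica count untouched, which is the behavior the dead zone is meant to provide.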
- Added an interactive help window to the `hdx-scaler` terminal UI. This improves usability for users managing scaling settings from the TUI.
  - Press `h` to open the help panel, which explains available configuration fields.
  - Press `ESC` or `q` to close it.
- The `hdx-scaler` service now exposes detailed Prometheus metrics to improve the observability of autoscaling behavior. These metrics include:
  - `hdxscaler_total_scalers`: The total number of autoscalers running
  - `hdxscaler_ratio`: The calculated ratio, based on abs(normalized metric - normalized target) in Metric Range mode or target metric / observed metric in Metric Target mode
  - `hdxscaler_current_replicas`: The current number of replicas
  - `hdxscaler_target_count`: The number of targets the measured value is being averaged from
  - `hdxscaler_measured_value`: The measured average value of the target group metrics
  - `hdxscaler_desired_replicas`: The calculated number of replicas to scale to before applying any bounds checks, sensitizing, or throttling
  - `hdxscaler_bounded_replicas`: What desired replicas are set to after applying all bounds checks, sensitizing, and throttling
  - `hdxscaler_normalized_metric`: What the measured value is normalized to when using Metric Range as the scaling mode
  - `hdxscaler_normalized_target`: What the target value is normalized to when using Metric Range as the scaling mode
  - `hdxscaler_delta`: The difference between the normalized metric and normalized target, used in Metric Range mode
  - `hdxscaler_sensitivity_calc`: The value calculated by the sensitizer, used in Metric Range scaling mode to calculate the desired replicas
  - `hdxscaler_throttle_up_calc`: If scaling up, the max number of pods to scale up to; used instead of desired replicas if smaller
  - `hdxscaler_throttle_down_calc`: If scaling down, the number of pods to scale down by; used if larger than desired replicas

  Each metric (except `hdxscaler_total_scalers`) has the following labels:
  - `deployment`: the Deployment being scaled
  - `alias`: if that deployment has an alias, e.g. for intake-head, alias="stream-head"
  - `app`: the service or pool name being scraped for metrics to watch
  - `title`: the name for this autoscaler, which includes the deployment name and the metric / metric labels being used for scale calculations
- HDX Scaler now supports loading its config from a Kubernetes ConfigMap using the `CONFIG_MODE=k8s` environment variable.
  - Faster config changes
  - Default remains unchanged
  - Improves deployment stability
- HDX Scaler now ignores services when the `scale_off:true` tunable is set. This prevents accidental restarts of pods that should be scaled off.
  - To fully shut down the cluster, HDX Scaler must be scaled to zero manually.
- Added regular expressions (regex) for header and query-based dynamic ingest routing. To enable regex matching, prefix the value with `regex|`. See the example below:

  ```yaml
  routing:
    headers:
      x-hdx-table: regex|staging-project-(alpha|beta|gamma)
      x-hdx-transform: shared-staging-transform
    query_params:
      table: regex|customer_[0-9]+
      transform: tenant-transform
  ```
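The `regex|` prefix convention can be modeled in a few lines. This is a hedged sketch, not Hydrolix's actual router (the function name and the full-match semantics are assumptions): a value starting with `regex|` is treated as a regular expression, anything else as a literal comparison.

```python
import re

REGEX_PREFIX = "regex|"

def route_value_matches(configured: str, incoming: str) -> bool:
    """Match an incoming header/query value against a routing rule value."""
    if configured.startswith(REGEX_PREFIX):
        pattern = configured[len(REGEX_PREFIX):]
        # fullmatch so e.g. "staging-project-alpha-extra" doesn't pass on a partial match
        return re.fullmatch(pattern, incoming) is not None
    return configured == incoming

# The x-hdx-table rule from the example above:
rule = "regex|staging-project-(alpha|beta|gamma)"
print(route_value_matches(rule, "staging-project-beta"))   # True
print(route_value_matches(rule, "staging-project-delta"))  # False
```
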
- Quesma can now be exposed externally to operate with an existing Kibana deployment using the `enable_public_access` tunable. See Enable public access to Quesma for more information.
- Gunicorn will now scale up async workers in accordance with the number of CPUs available to the `hdx-traefik-auth` container, saving administrators time when adding or removing resources from Traefik. This behavior can be overridden using the `hdx_traefik_auth_workers` tunable.
Improved how email is handled during cluster creation and password resets, especially for customer-deployed environments.
- Cluster creation emails are now only sent to
ADMIN_EMAIL
(if set). If not set, no email is sent. - Prevented all emails (invites, access codes, password resets) from being sent to
CI_USER_EMAIL
(e.g.[email protected]
or[email protected]
). - Invite logic now matches password reset logic for email suppression.
- Added support for custom SMTP configuration to allow customer-managed delivery.
- Cluster creation emails are now only sent to
- Added support for updating the patch level of third-party software in internal images. Added the `patch_date` tunable.
Intake and Merge
- Added support to the Vector software inside the cluster allowing specification of Bearer tokens to accompany messages sent to `logs_sink_remote` endpoints. This facilitates the use of service accounts to send logs to remote endpoints.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
- Removed the `Serialize` derivation from objects in the common crate, reducing redundancy in the code.
- Added a new `kafka_sasl_plain` credential type for use with Kafka ingest. The `kafka-peer` can now authenticate with SASL_SSL using `username` and `password` credentials.
Query
- Replaced the use of `min_timestamp` and `max_timestamp` in catalog queries with a more specific SQL condition. This prevents contradictory conditions from fetching any partitions. (For example, `WHERE primary < timestamp AND primary > timestamp` should return no results.)
- Fixed an incorrect table name in log messages like the following: `db='{db_name}' Storage='TurbineStorage' removed` (`TurbineStorage` being incorrect).
- Columns in the GROUP BY clause of summary table SQL may now be removed. This allows a high-cardinality column to be removed from a summary table without dropping and re-creating the summary table.
- Support for new `ip` and `uuid` datatypes has been added. See Support for new `ip` and `uuid` datatypes for more information.
Security
- Set a `HYDROLIX_TOKEN` cookie via the API using the attributes `secure`, `httponly`, and `samesite`. Use the `/users/current` API endpoint to determine expired tokens and clear cookies.
- Support for row-level access control. See Notable new features for details.
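The three cookie attributes work together: `secure` restricts the cookie to HTTPS, `httponly` hides it from JavaScript, and `samesite` limits cross-site sending. A minimal sketch of such a Set-Cookie header using Python's standard library (illustrative only; the cookie name matches the release note, the token value and SameSite policy are placeholder assumptions):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["HYDROLIX_TOKEN"] = "example-session-token"
cookie["HYDROLIX_TOKEN"]["secure"] = True      # only sent over HTTPS
cookie["HYDROLIX_TOKEN"]["httponly"] = True    # invisible to document.cookie
cookie["HYDROLIX_TOKEN"]["samesite"] = "Lax"   # not sent on cross-site subrequests

# The attribute string a server would emit in its Set-Cookie header:
header = cookie["HYDROLIX_TOKEN"].OutputString()
print(header)
```
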
UI
- Support for row-level access control. See Notable new features for details.
- The UI's browser tab title is now set to the cluster hostname to ease the management of multiple clusters.
- Added a "Column Management" sidebar to the UI in Advanced Options. This feature is only available for non-summary tables.
- View and edit current and past column names
- Optionally set an alias as the current display name
- Add new aliases to existing columns
Bug Fixes
Config API
- The `add_names` endpoint now adds column names, rather than returning an HTTP 500 error.
- Pools-related RBAC permissions are now checked properly so only authorized users can edit resource pools in the API and UI.
Intake and Merge
- Added the ability to split merge target bounds into smaller chunks. This is an attempt to address an issue with slow merges for late-arriving data in Era 3.
- The `active` field in transforms is now optional.
- Support for new `ip` and `uuid` datatypes has been added.
- The new merge controller used to expect strings for the `min` and `max` settings for field limits. It now expects integers.
- The merge controller now properly handles `null` or empty `shard_key` columns in the catalog.
- Reattached the `duplicate_partitions` metric, which was unintentionally disconnected when the `AdaptivePartitionSource` abstraction was introduced. The metric now tracks duplicate partitions during sourcing and consistently updates in telemetry.
- Changed a misleading `WARNING` message in `intake-head` to `INFO` during rejector startup. This reduces noise in logs and prevents confusion during troubleshooting. No functional changes or documentation impact.
- Added a safeguard to prevent a rare buffer overrun panic in the custom CSV reader used by `intake-head`.
  - The panic was triggered by an out-of-bounds write during CSV ingestion, likely due to a misaligned buffer size and input row length.
  - A pre-write check was added to grow the destination buffer as needed.
  - The patch does not fix the root cause of the excessive buffer size, but avoids the crash and allows the ingestion to recover gracefully.
  - Also improved support for custom-sized buffer requests by configuring `bufio.Scanner` to handle larger token lengths as needed.
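The pre-write safeguard described above is language-agnostic: the actual fix lives in the Go reader, but the idea reduces to checking and growing the destination buffer before copying a row into it. A hedged sketch (function name and harness are illustrative, not the shipped code):

```python
def write_row(buf: bytearray, offset: int, row: bytes) -> int:
    """Copy a CSV row into buf at offset, growing buf first if the row
    would overrun it (the pre-write check that prevents the panic)."""
    end = offset + len(row)
    if end > len(buf):
        # Grow the destination instead of writing out of bounds.
        buf.extend(bytes(end - len(buf)))
    buf[offset:end] = row
    return end  # next write offset

buf = bytearray(8)               # deliberately undersized buffer
next_off = write_row(buf, 0, b"id,name,addr\n")
print(len(buf) >= next_off)      # True: the buffer grew to fit the row
```

Without the `end > len(buf)` check, the oversized write would fail (in Go, an out-of-bounds copy panics), which is exactly the crash the release note describes.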
Cluster Operations
- `hdx-scaler` now recognizes its configuration settings.
  - Fixed the install for `hdx-scaler`, putting it in `/usr/local/bin/` where it's on the system $PATH.
- The new cluster login invitation email now gives the user the correct URL to log into the cluster.
- The ACME init job no longer performs a separate, unneeded DNS resolution check before attempting certificate issuance. This resolves issues where DNS records existed but pods couldn’t see them due to stale cache.
  - Removed the standalone DNS check from the `init-acme` container.
  - Consolidated route verification into the `start-lego` script using HTTP-based validation.
  - Mounted the `acme-account` secret in `init-acme` to enable reuse of existing account data.
  - Restructured the ACME logic to be more modular and resilient for both job and cronjob execution.
UI
- Fixed a bug with disappearing data in the Transform SQL field.
- Updated the total volume, compression ratio, and compression percent calculations for table details used in UI analysis to correctly use `raw_size` instead of `mem_size`.
- The Hydrolix UI no longer relies on `HYDROLIX_USER_ID` and `HYDROLIX_ORGANIZATION_ID` cookies. Instead, session continuity uses only `HYDROLIX_TOKEN`, with user details from the authenticated `/current` endpoint. This fixes an issue where the UI would fail to load properly if only the session token cookie was present.
- Error messages for bad requests to alter/batch jobs are more descriptive.
- Pages that don't use API pagination now show full lists of items.