23 April 2025 - v5.1.2

Column renaming, support for service accounts, simplified Grafana deployment.

Notable New Features

Column renaming

Create additional names for existing columns to correct earlier naming choices or adapt to new naming conventions.
- The most recently added column name becomes the primary name.
- All prior names refer to the same table column.
- Renaming is not supported for summary tables.
- Introduces /orgs/:org_id/projects/:project_id/tables/:table_id/columns GET and POST endpoints for managing column names and aliases.
- See Breaking changes on transform conflicts before upgrading.
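The naming rules above can be illustrated with a small in-memory model. This is only a sketch of the described semantics, not Hydrolix code: the Column class, its methods, and the example column names are hypothetical and exist purely for illustration. The real feature is managed through the columns endpoints listed above.

```python
class Column:
    """Hypothetical model of one table column that has several names."""

    def __init__(self, name):
        # Names in the order they were added; the last one is primary.
        self._names = [name]

    def add_name(self, new_name):
        """Register an additional name; it becomes the primary name."""
        if new_name in self._names:
            raise ValueError(f"name already in use: {new_name}")
        self._names.append(new_name)

    @property
    def primary(self):
        """The most recently added name."""
        return self._names[-1]

    def matches(self, name):
        """Every name ever added still refers to this same column."""
        return name in self._names


col = Column("resp_ms")
col.add_name("response_time_ms")
print(col.primary)             # the most recently added name
print(col.matches("resp_ms"))  # the prior name still resolves
```

In this model, adding a name never removes an earlier one, which mirrors the statement that all prior names continue to refer to the same table column.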

New service accounts

Users with super_user or All permissions can create service accounts.
Service accounts can generate JSON Web Tokens (JWTs) with a lifetime of 365 days.
New underlying APIs implement support for managing service accounts.
- /config/v1/service_accounts
- /config/v1/service_accounts/:service_account_id/tokens
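As a rough illustration of the two facts above (the tokens endpoint path and the 365-day token lifetime), the following sketch fills in the path placeholder and computes when a token issued on a given date would expire. The helper names and the example ID are hypothetical. These notes do not describe the request or response format, so none is shown.

```python
from datetime import datetime, timedelta, timezone

# Lifetime stated in these release notes.
TOKEN_LIFETIME = timedelta(days=365)


def tokens_path(service_account_id):
    """Fill the :service_account_id placeholder in the tokens endpoint path."""
    return f"/config/v1/service_accounts/{service_account_id}/tokens"


def token_expiry(issued_at):
    """Return the expiry time of a token issued at the given time."""
    return issued_at + TOKEN_LIFETIME


# "example-id" is a made-up identifier used only for illustration.
issued = datetime(2025, 4, 23, tzinfo=timezone.utc)
print(tokens_path("example-id"))
print(token_expiry(issued).date())
```

A calculation like this can help schedule token rotation before the 365-day lifetime ends.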

Simplified Grafana deployment

Hydrolix can now deploy Grafana into the cluster automatically. Set the data_visualization_tools tunable to enable it.

Breaking changes

🚧
Transform conflicts must be resolved before upgrade
Install hdxcli and use hdxcli check-health to see and fix transform conflicts.
A transform conflict can occur when two existing transforms disagree about column types or attributes. Previously, it was possible to create a conflict, for example by using the force_operation flag, which was intended to fix badly broken summary transforms. The force_operation flag is also removed in this release.
The new column renaming feature is incompatible with transform conflicts. If you have questions or need help, please contact Hydrolix Customer Success.
Upgrading to 5.1 will fail if transform conflicts are not resolved first.

Upgrade

Upgrade on Google Kubernetes Engine

kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Upgrade on Amazon EKS

kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

Upgrade on Linode Kubernetes Engine

kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}"

Changelog

UI improvements

Added shadow tables settings to the transform management UI.
Redesigned the Summary Table to improve analysis of tables with many columns, ease interaction, and provide more feedback.
Made storage drop-down lists searchable. This improves usability in clusters with many storage options. See Table > Bucket settings > Storage mapping.
Removed "force operation" from UI. See also column renaming and transform conflicts.
Included visual differentiators and higher contrast between foreground and background colors to improve accessibility on Jobs page.
Added ability to select multiple event types on the Security > Auth Logs page.
Updated the pool vertical ellipsis menu to confirm the user's intent. Selecting Delete now also offers Restore Defaults, so the user confirms either a reset to defaults or a deletion.

Authentication and permission improvements

Introduced a service account management feature to the API. Service accounts can now generate 365-day authentication tokens.

Ingest improvements

Introduced queue polling to measure cloud queue depth to improve visibility of autoingest backlog. Metrics are now available for AWS SQS, Google PubSub, and Azure Service Bus.
When creating a local storage directory for an object store, Hydrolix only uses the bucket name as the subdirectory. The root directory is always set as /.
Introduced Google PubSub authentication support, using Kubernetes credentials stored in gcp_service_account_keys.

Merge and data lifecycle improvements

Introduced a partition cleaner service to handle log and reject file vacuuming jobs. Added a periodic Keycloak backup vacuum. Removed unnecessary tunables.
Added a duty cycle metric to the merge controller, for use with future load-aware cluster scaling.
Exposed merge health metrics to Prometheus and directly via HTTP, to improve observability of merge controller behavior.

API improvements

Introduced column renaming support in API and core.
Removed sync-catalog to reduce database locks that could cause data loss.

Cluster operation improvements

Automated deployment of Grafana (OSS or Enterprise) into the cluster by setting the data_visualization_tools tunable.
Introduced a high-availability cache for Keycloak authentication inside the cluster. Sessions are no longer interrupted during routine Keycloak pod maintenance.
Added CSV output option to Hydrolix Kubernetes Tool (HKT), allowing textual listing of all tunables.
Added a cluster spec diffing tool to HKT, to ease comparison operations.
Added hdx-node in Rust with feature parity and support for managing iptables port lists. This improves behavior for LKE clusters.
Introduced richer startup logic for ZooKeeper health checking, to handle both development and high-availability configurations.

Hydrolix engine improvements

Corrected handling of presigned URLs for summary tables. This fixes the HdxPeerSource error when using turbine_summary_url.

Bug fixes

UI

Upgraded JavaScript Next.js library to fix CVE-2025-29927, an authorization bypass vulnerability.
Displayed No data on Security > Credentials when no credentials are present.
Added confirmation upon successful role creation.
Corrected date picker and API timestamp formatting mismatch for batch and alter jobs pages.
Switched to re2js to avoid a regular expression denial of service (ReDoS). Also updated axios library to address CVE-2025-27152, to avoid server side request forgery (SSRF) vulnerabilities.
Reset the default flag when cloning transforms in the UI. This prevents accidental replacement of the default transform, and allows users to set the toggle while manipulating the new transform.
Fixed an edge case in the credential filtering dropdown menu. Formerly, when AWS was selected, credential types for other clouds were included.
Fixed an issue where error messages for required fields were displayed in the advanced options sidebar.

API

Changed user deletion behavior so that a user can be deleted only when all related batch or alter jobs are in a terminal state: done, canceled, or failed. If any job is in the ready, running, or pending state, the API returns a 400 error.
Added the intake_head_url field to show the intake-head URL in list and detail views.
Recreated the hydrologs Kafka data source under downgrade scenarios. This ensures smooth upgrades and version rollbacks.
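The deletion rule above can be sketched as a simple predicate over job states. This is an illustration of the rule as described, not the API's implementation: the function name is hypothetical, and treating any unrecognized state as blocking is an assumption made here.

```python
# Terminal states named in the release note.
TERMINAL_STATES = {"done", "canceled", "failed"}


def can_delete_user(job_states):
    """Return True only if every related batch or alter job is terminal.

    Assumption: a state outside TERMINAL_STATES (including ready, running,
    and pending) blocks deletion. A user with no related jobs is deletable.
    """
    return all(state in TERMINAL_STATES for state in job_states)


print(can_delete_user(["done", "failed"]))
print(can_delete_user(["done", "running"]))
print(can_delete_user([]))
```

When the predicate is False, the note says the API responds with a 400 error.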

Hydrolix engine

Prevented catalog query segfault by correcting thread safety on a PostgreSQL client connection variable.
Fixed "Timestamp must have format" error in summary transform SQL by populating datetime format entries on maps and arrays.
Fixed a case of missing row results when using LIMIT on tiny partitions. In rare cases involving LIMIT and tiny partitions, rows were returned too early.

Cluster operations

Fixed a case where a load balancer could linger after its expected deletion.
Stopped sending all kopf logs to the k8s event API, decreasing API server flooding.
Fixed a SQL injection bug by switching to a parameterized form. Upgraded the Go network library to fix an HTTP proxy bypass using IPv6 zone IDs (CVE-2025-22870).
Removed ttlSecondsAfterFinished from cluster initialization jobs to prevent them from running inadvertently.
Fixed invalid default URL for automatic Quesma configuration. It now refers to the query head.
Prevented Traefik from failing to reload a newer TLS certificate by including certificate timestamp metadata in Traefik's dynamic configuration file.
Prevented loading certificates for use with Kafka on the hydro.logs pool. The hydro.logs pool is for internal use only, but runs its own Kafka intake source.

Ingest

Stopped performing unnecessary allocation on intake heads when locating column index for catch_all and catch_rejects columns.
Streamlined merge operations to focus only on the necessary fields to optimize merge processes.
Introduced a new Cloud for cases where a vendor isn't AWS, but does use S3. Hydrolix now inspects both the configured cloud string and the configured endpoint if present. This improves Linode cloud support specifically, but works with any vendor that uses S3.
Shrank 26 histograms emitted from the intake system to decrease cardinality and reduce pressure on the Prometheus time series database.
Switched the partition cleaner to a scheduled task, rather than constantly running. This limits the frequency of listing storage buckets.

Changes to metrics and tunables

Removed the following tunables that control log and reject files vacuuming:

log_vacuum_enabled
log_vacuum_schedule
log_vacuum_max_age
log_vacuum_concurrency
log_vacuum_dry_run
rejects_vacuum_enabled
rejects_vacuum_schedule
rejects_vacuum_max_age
rejects_vacuum_concurrency
rejects_vacuum_dry_run
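Before upgrading, it may help to check whether an existing configuration still sets any of the removed tunables. The sketch below assumes the tunables are available as a flat dictionary of names to values, which is a simplification. The function name and the example dictionary are hypothetical.

```python
# The ten removed tunables, built from the two prefixes and five suffixes
# that appear in the list above.
REMOVED_TUNABLES = {
    f"{prefix}_vacuum_{suffix}"
    for prefix in ("log", "rejects")
    for suffix in ("enabled", "schedule", "max_age", "concurrency", "dry_run")
}


def find_removed_tunables(tunables):
    """Return the sorted names in `tunables` that this release removed."""
    return sorted(name for name in tunables if name in REMOVED_TUNABLES)


# Hypothetical example configuration, flattened to name/value pairs.
example = {
    "log_vacuum_enabled": True,
    "rejects_vacuum_dry_run": False,
    "hdx_node_enabled": True,
}
print(find_removed_tunables(example))
```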

Added new metrics for the periodic jobs:

keycloak_deletes
keycloak_delete_size
keycloak_failures
keycloak_visited
reject_deletes
reject_delete_size
reject_failures
reject_visited
log_deletes
log_delete_size
log_failures
log_visited

Added new metric autoingest_queue_messages that uses labels provider and queue_name to report on queue depth for autoingestion.
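As an illustration of how the provider and queue_name labels might be consumed, the sketch below parses samples in the Prometheus text exposition format and collects queue depth per (provider, queue_name) pair. The sample lines, label values, and function name are made up for this example. The actual label values emitted by a cluster are not listed in these notes.

```python
import re

# Hypothetical scrape output; the label values are invented for illustration.
SAMPLE = """\
autoingest_queue_messages{provider="aws",queue_name="example-queue-a"} 120
autoingest_queue_messages{provider="gcp",queue_name="example-queue-b"} 7
some_other_metric{provider="aws"} 1
"""

LINE = re.compile(
    r'^autoingest_queue_messages\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def queue_depths(text):
    """Map (provider, queue_name) to the reported queue depth."""
    depths = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip other metrics, comments, and blank lines
        labels = dict(LABEL.findall(match.group("labels")))
        key = (labels.get("provider"), labels.get("queue_name"))
        depths[key] = float(match.group("value"))
    return depths


for (provider, queue), depth in queue_depths(SAMPLE).items():
    print(provider, queue, depth)
```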

Added new tunables for hdx-node internode health checking:

hdx_node_enabled, boolean, defaults to False, replaces hdx_node
hdx_node_config, YAML