23 April 2025 - v5.1.2
Column renaming, support for service accounts, simplified Grafana deployment.
Notable New Features
Column renaming
- Create additional names for existing columns to correct earlier naming choices or adapt to new naming conventions.
- The most recently added column name becomes the primary name.
- All prior names refer to the same table column.
- Renaming is not supported for summary tables.
- Introduces GET and POST endpoints at /orgs/:org_id/projects/:project_id/tables/:table_id/columns for managing column names and aliases.
- See Breaking changes on transform conflicts before upgrading.
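The naming rules above can be pictured with a small model. This is only an illustrative sketch, not the Hydrolix implementation: the class ColumnAliases and the column names src_ip and client_ip are hypothetical, chosen to show that the newest name becomes primary while every earlier name still resolves to the same column.

```python
class ColumnAliases:
    """Toy model of one table column that accumulates names over time."""

    def __init__(self, original_name):
        # Names are kept oldest first, newest last.
        self._names = [original_name]

    def rename(self, new_name):
        """Add a new name; it becomes the primary name."""
        if new_name in self._names:
            raise ValueError(f"{new_name!r} already refers to this column")
        self._names.append(new_name)

    @property
    def primary(self):
        """The most recently added name."""
        return self._names[-1]

    def refers_to(self, name):
        """True if the name (current or prior) points at this column."""
        return name in self._names


col = ColumnAliases("src_ip")   # hypothetical original column name
col.rename("client_ip")         # hypothetical corrected name
print(col.primary, col.refers_to("src_ip"))
```

In this model a query that still uses the old name keeps working, which is the property the release notes describe for prior names.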
New service accounts
- Users with super_user or All permissions can create service accounts.
- Service accounts can generate JSON Web Tokens (JWTs) with a lifetime of 365 days.
- New underlying APIs implement support for managing service accounts.
/config/v1/service_accounts
/config/v1/service_accounts/:service_account_id/tokens
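Because service account tokens are long-lived, it can be useful to check how much lifetime a token has left. The sketch below reads the standard exp claim from a JWT payload without verifying the signature. It assumes the token carries exp as a Unix timestamp, which is the usual JWT convention but is not confirmed here for Hydrolix tokens. The helper names and the demo token are made up for illustration.

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the payload segment of a JWT. Does NOT verify the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def days_until_expiry(token, now=None):
    """Days remaining before the token's exp claim (assumed to be present)."""
    now = time.time() if now is None else now
    return (jwt_claims(token)["exp"] - now) / 86400


def _b64url(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo only)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Build a fake, unsigned token issued at a fixed time with a 365-day lifetime.
issued_at = 1_745_366_400
demo_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"iat": issued_at, "exp": issued_at + 365 * 86400}),
    "signature-placeholder",
])
print(days_until_expiry(demo_token, now=issued_at))
```

A check like this is only for planning token rotation. Trust decisions still require proper signature verification.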
Simplified Grafana deployment
Deployment of Grafana is now provided automatically in Hydrolix.
Breaking changes
Transform conflicts must be resolved before upgrade
Install hdxcli and use hdxcli check-health to see and fix transform conflicts.
A transform conflict can occur when two existing transforms disagree about column types or attributes. Previously, it was possible to create a conflict, for example with the force_operation flag, intended to fix badly broken summary transforms. (The force_operation flag also disappears with this release.)
The new column renaming feature is incompatible with transform conflicts. If you have questions or need help, please contact Hydrolix Customer Success.
Upgrading to 5.1 will fail if transform conflicts are not resolved first.
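To make the idea of a transform conflict concrete, here is a minimal sketch of how two transforms can disagree about a column's type. It is not how hdxcli check-health works internally. The dictionary layout, the function find_conflicts, and the transform and column names are all hypothetical simplifications.

```python
def find_conflicts(transforms):
    """Return (column, first_transform, conflicting_transform) tuples.

    `transforms` maps a transform name to a {column_name: type} dict.
    This flat layout is an assumption made for the example.
    """
    seen = {}        # column name -> (type, transform where first seen)
    conflicts = []
    for transform_name, columns in transforms.items():
        for column, column_type in columns.items():
            if column in seen and seen[column][0] != column_type:
                conflicts.append((column, seen[column][1], transform_name))
            else:
                seen.setdefault(column, (column_type, transform_name))
    return conflicts


# Hypothetical transforms on the same table.
transforms = {
    "t1": {"status": "uint32", "host": "string"},
    "t2": {"status": "string", "host": "string"},
}
print(find_conflicts(transforms))
```

Here t1 and t2 agree on host but disagree on status, so a single name-to-column mapping, which column renaming relies on, would be ambiguous.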
Upgrade
Upgrade on Google Kubernetes Engine
kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"
Upgrade on Amazon EKS
kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
Upgrade on Linode Kubernetes Engine
kubectl apply -f "https://www.hydrolix.io/operator/v5.1.2/operator-resources?namespace=$HDX_KUBERNETES_NAMESPACE"
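The commands above interpolate shell variables into the URL, and an unset variable silently expands to an empty string. As a hedged sketch, the helper below builds the same URL and refuses to continue when a required variable is missing. The function name and its arguments are hypothetical, and only the URL pattern comes from the commands above. Note that urlencode percent-encodes special characters in values.

```python
from urllib.parse import urlencode

BASE = "https://www.hydrolix.io/operator/{version}/operator-resources"


def operator_resources_url(version, env, extra=None):
    """Build the operator-resources URL from environment-style variables.

    `env` is any mapping, such as os.environ. `extra` maps a query
    parameter name to the variable that holds its value, for example
    {"aws-storage-role": "AWS_STORAGE_ROLE"}.
    """
    names = {"namespace": "HDX_KUBERNETES_NAMESPACE", **(extra or {})}
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise ValueError("unset or empty variables: " + ", ".join(missing))
    query = urlencode({param: env[var] for param, var in names.items()})
    return BASE.format(version=version) + "?" + query


# "hdx" is a placeholder namespace for the example.
url = operator_resources_url("v5.1.2", {"HDX_KUBERNETES_NAMESPACE": "hdx"})
print(url)
```

The resulting string can be passed to kubectl apply -f in place of the hand-built URL.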
Changelog
UI improvements
- Added shadow tables settings to the transform management UI.
- Redesigned the Summary Table to improve column analysis for tables with many columns, ease interaction, and provide more feedback.
- Allowed searchability in storage drop down lists. This improves usability in clusters with many storage options. See Table > Bucket settings > Storage mapping.
- Removed "force operation" from UI. See also column renaming and transform conflicts.
- Included visual differentiators and higher contrast between foreground and background colors to improve accessibility on Jobs page.
- Added ability to select multiple event types on the Security > Auth Logs page.
- Updated the pool vertical ellipsis menu to verify the user's intent. Added Restore Defaults when selecting Delete to confirm reset to default, or deletion.
Authentication and permission improvements
- Introduced service account management feature to the API. Service accounts can now generate 365 day authentication tokens.
Ingest improvements
- Introduced queue polling to measure cloud queue depth to improve visibility of autoingest backlog. Metrics are now available for AWS SQS, Google PubSub, and Azure Service Bus.
- When creating a local storage directory for an object store, Hydrolix only uses the bucket name as the subdirectory. The root directory is always set as /.
- Introduced Google PubSub authentication support, using Kubernetes credentials stored in gcp_service_account_keys.
Merge and data lifecycle improvements
- Introduced a partition cleaner service to handle log and reject file vacuuming jobs. Added a periodic Keycloak backup vacuum. Removed unnecessary tunables.
- Added a duty cycle metric to the merge controller, to incorporate with future, load-aware cluster scaling.
- Exposed merge health metrics to prometheus and directly via HTTP, to improve observability of merge controller behavior.
API improvements
- Introduced column renaming support in API and core.
- Removed sync-catalog to reduce database locks that could cause data loss.
Cluster operation improvements
- Automated deployment of Grafana (OSS or Enterprise) into the cluster by setting the data_visualization_tools tunable.
- Introduced a high-availability cache for the Keycloak authentication cache inside the cluster. Sessions will no longer be interrupted when routine Keycloak pod maintenance occurs.
- Added CSV output option to Hydrolix Kubernetes Tool (HKT), allowing textual listing of all tunables.
- Added a cluster spec diffing tool to HKT, to ease comparison operations.
- Added hdx-node in Rust with feature parity and support for managing iptables port lists. This improves behavior for LKE clusters.
- Introduced richer startup logic for ZooKeeper health checking, to handle both development and high-availability configurations.
Hydrolix engine improvements
- Corrected handling of presigned URLs for summary tables. This fixes the HdxPeerSource error when using turbine_summary_url.
Bug fixes
UI
- Upgraded JavaScript Next.js library to fix CVE-2025-29927, an authorization bypass vulnerability.
- Displayed No data on Security > Credentials when no credentials are present.
- Added confirmation upon successful role creation.
- Corrected date picker and API timestamp formatting mismatch for batch and alter jobs pages.
- Switched to re2js to avoid a regular expression denial of service (ReDoS). Also updated axios library to address CVE-2025-27152, to avoid server side request forgery (SSRF) vulnerabilities.
- Reset the default flag when cloning transforms in the UI. This prevents accidental replacement of the default transform, and allows users to set the toggle while manipulating the new transform.
- Fixed an edge case in the credential filtering dropdown menu. Formerly, when AWS was selected, credential types for other clouds were included.
- Fixed an issue where error messages for required fields were displayed in the advanced options sidebar.
API
- Changed user deletion behavior to allow user deletion only when all related batch or alter jobs are in terminal states: done, canceled, or failed. Otherwise, return a 400 error if any job is in ready, running, or pending states.
- Added the intake_head_url field to show the intake-head URL in list and detail views.
- Recreated hydrologs Kafka data source under downgrade scenarios. Ensures smooth upgrade and version rollback.
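The user deletion rule above reduces to a simple check over job states. The following is a minimal sketch of that rule, not the API's actual code. The function name is hypothetical, and the state names are taken from the note above.

```python
TERMINAL_STATES = {"done", "canceled", "failed"}


def can_delete_user(job_states):
    """True only if every related batch or alter job is in a terminal state.

    A user with no related jobs is treated as deletable, which is an
    assumption of this sketch.
    """
    return all(state in TERMINAL_STATES for state in job_states)


print(can_delete_user(["done", "failed"]))    # all terminal
print(can_delete_user(["done", "running"]))   # one job still active
```

In the second case the API described above would reject the request with a 400 error, and the caller would need to wait for the job to finish or cancel it first.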
Hydrolix engine
- Prevented catalog query segfault by correcting thread safety on a PostgreSQL client connection variable.
- Fixed "Timestamp must have format" error in summary transform SQL by populating datetime format entries on maps and arrays.
- Fixed a case of missing row results when using LIMIT on tiny partitions. In rare cases involving LIMIT and tiny partitions, rows were returned too early.
Cluster operations
- Fixed a behavior allowing a lingering load balancer after expected deletion.
- Stopped sending all kopf logs to the k8s event API, decreasing API server flooding.
- Fixed SQL injection bug by switching to a parameterized form. Upgraded Go network library to fix HTTP proxy bypass using IPv6 zone IDs (CVE-2025-22870).
- Removed ttlSecondsAfterFinished from cluster initialization jobs to prevent them from running inadvertently.
- Fixed invalid default URL for automatic Quesma configuration. It now refers to the query head.
- Prevented Traefik failing to reload a newer TLS certificate by including certificate timestamp metadata in Traefik's dynamic configuration file.
- Prevented loading certificates for use with Kafka on the hydro.logs pool. The hydro.logs pool is for internal use only, but runs its own Kafka intake source.
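For readers unfamiliar with the parameterized form mentioned in the SQL injection fix, here is a generic illustration using Python's built-in sqlite3 module. It does not reproduce the affected Hydrolix code, and the table and values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("INSERT INTO items VALUES ('merge')")

# Attacker-controlled input that tries to widen the WHERE clause.
malicious = "x' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes the query.
unsafe_rows = conn.execute(
    f"SELECT name FROM items WHERE name = '{malicious}'"
).fetchall()

# Safe: the input is bound as a parameter and only ever treated as a value.
safe_rows = conn.execute(
    "SELECT name FROM items WHERE name = ?", (malicious,)
).fetchall()

print(unsafe_rows, safe_rows)
```

The spliced query matches every row, while the parameterized query matches none, because no item is literally named with the malicious string.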
Ingest
- Stopped performing unnecessary allocation on intake heads when locating column index for catch_all and catch_rejects columns.
- Streamlined merge operations to focus only on the necessary fields to optimize merge processes.
- Introduced a new Cloud for cases where a vendor isn't AWS, but does use S3. Hydrolix now inspects both the configured cloud string and the configured endpoint if present. This improves Linode cloud support specifically, but works with any vendor that uses S3.
- Shrank 26 histograms emitted from the intake system to decrease cardinality and reduce pressure on the Prometheus time series database.
- Switched partition cleaner to a scheduled task, rather than constantly running. This limits the frequency of listing storage buckets.
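The idea of inspecting both the cloud string and the endpoint can be sketched as follows. This is a guess at the general shape of such a check, not the Hydrolix logic: the function name, the returned labels, and the amazonaws.com hostname test are assumptions made for illustration.

```python
from urllib.parse import urlparse


def classify_object_store(cloud, endpoint=None):
    """Classify a storage config as native AWS or an S3-compatible vendor.

    `cloud` is the configured cloud string; `endpoint` is an optional URL.
    An endpoint without a scheme has no parseable hostname here and is
    treated as non-AWS.
    """
    cloud = cloud.strip().lower()
    if cloud != "aws":
        return cloud
    if endpoint:
        host = urlparse(endpoint).hostname or ""
        if not host.endswith("amazonaws.com"):
            return "s3_compatible"   # hypothetical label
    return "aws"


print(classify_object_store("aws"))
print(classify_object_store("aws", "https://us-east-1.linodeobjects.com"))
```

With a check of this kind, a configuration that names AWS but points at a Linode Object Storage endpoint is handled as an S3-compatible vendor and not as AWS itself.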
Changes to metrics and tunables
Removed the following tunables that control log and reject files vacuuming:
log_vacuum_enabled
log_vacuum_schedule
log_vacuum_max_age
log_vacuum_concurrency
log_vacuum_dry_run
rejects_vacuum_enabled
rejects_vacuum_schedule
rejects_vacuum_max_age
rejects_vacuum_concurrency
rejects_vacuum_dry_run
Added new metrics for the periodic jobs:
keycloak_deletes
keycloak_delete_size
keycloak_failures
keycloak_visited
reject_deletes
reject_delete_size
reject_failures
reject_visited
log_deletes
log_delete_size
log_failures
log_visited
Added new metric autoingest_queue_messages that uses labels provider and queue_name to report on queue depth for autoingestion.
Added new tunables for hdx-node internode health checking:
hdx_node_enabled, boolean, defaults to False, replaces hdx_node
hdx_node_config, YAML