v5.0.0
ClickHouse upgrade, new Shadow Tables feature, Native MySQL client support, better API documentation, easier Kibana integration via Quesma, and Hydrolix Tunable Names
Notable New Features
- ClickHouse Upgrade
    - The ClickHouse library has been upgraded to version 24.8.6.70. In addition to performance improvements and bugfixes, the following functions are now available:
        - Top K: `approx_top_k`, `topKWeighted`, and `approx_top_sum`
        - String related: `base64URLEncode`, `base64URLDecode`, `tryBase64URLDecode`, and `groupConcat`
        - Time related: `toMillisecond`
        - Set handling: `groupArrayIntersect`
        - Windowing: `percent_rank`
        - and more...
- Native MySQL client support
    - Client applications can now connect to and query a Hydrolix cluster with MySQL clients. The MySQL server listens on tcp/9004. This opens up more integration possibilities.
- Better API Documentation
    - The API documentation available from the `/config/schema/` endpoint has been reorganized and improved, making way for more complete API documentation in future releases.
- Simplified Kibana, Quesma, and Elasticsearch Integration
    - Deployment of Kibana, Quesma, and Elasticsearch is now provided automatically within Hydrolix.
- Hydrolix Tunable Names (HTN)
    - Using a structured naming pattern, all tunables can be applied to multiple services, pools and containers. When defined, an HTN `htn:<service>:<pool>:<container>: <value>` takes precedence over traditional key-value tunables.
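Before pointing a MySQL client at the new listener, it can help to confirm that the port is reachable at all. The following is a minimal sketch and not part of Hydrolix: the function name `mysql_port_reachable` and the hostname in the usage note are hypothetical, and only the port number (tcp/9004) comes from these notes.

```python
import socket


def mysql_port_reachable(host: str, port: int = 9004, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only checks that something accepts connections on the port.
    It does not authenticate, negotiate TLS, or speak the MySQL protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `mysql_port_reachable("hydrolix.example.com")` (a placeholder hostname) returns `False` when the listener is disabled or blocked by a firewall, which narrows down connection problems before any client configuration is involved.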
Breaking Changes
🚧 GET table_query_options has been removed
The `/table_query_options` API path has been removed. The same functionality is now available at the more flexible `/query_options` path, which works on both tables and projects. See the API documentation for more information: Table Query Options and Project Query Options.
Upgrade
Do not skip minor versions when upgrading or downgrading
Skipping versions when upgrading or downgrading Hydrolix can result in database schema inconsistencies and cluster instability. Always upgrade or downgrade sequentially through each minor version.
Example:
Upgrade from 5.5.0 → 5.6.x → 5.7.4, not 5.5.0 → 5.7.4.
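The sequential rule can be sketched as a small helper that lists the versions to step through. This is an illustrative sketch only: the function name `version_path` is hypothetical, `.x` stands for whichever patch release of an intermediate minor version you choose (as in the example above), and the sketch deliberately refuses to cross major versions because these notes do not describe that case.

```python
def version_path(current: str, target: str) -> list[str]:
    """List the versions to move through, one minor version at a time.

    Works for upgrades and downgrades within a single major version.
    Intermediate steps are written as "<major>.<minor>.x".
    """
    cur_major, cur_minor, _ = (int(part) for part in current.split("."))
    tgt_major, tgt_minor, _ = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles a single major version")
    if current == target:
        return []

    # Step up for upgrades, down for downgrades.
    step = 1 if tgt_minor >= cur_minor else -1
    path = [
        f"{cur_major}.{minor}.x"
        for minor in range(cur_minor + step, tgt_minor, step)
    ]
    path.append(target)
    return path
```

Calling `version_path("5.5.0", "5.7.4")` yields `["5.6.x", "5.7.4"]`, matching the example above.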
Upgrade on GKE
Upgrade on EKS
Upgrade on LKE
Changelog
General
- Hydrolix Engine
    - Updated ClickHouse from version 23.8.10.43 to 24.8.6.70. We also updated the following libraries:
        - openssl 1.1.1n -> 3.3.2
        - absl 20211102.0 -> 20240722.0
        - fmt 8.1.1 -> 9.1.0
        - libpq 16.4 -> 16.5
- Data Transformation
    - Introduced UI support for creating, copying, viewing and customizing transform templates.
    - Allowed UI download of summary table transforms.
    - Introduced support for displaying and editing WURFL settings while managing transforms.
- Ingest
    - Added Shadow Tables feature. Shadow Tables only receive a certain percentage of randomly-sampled data from the input source.
    - Amazon Data Firehose can now decode more than one data object from a single data segment.
    - Improved the SaaS metering functionality ("Usagemeter") by adding better housekeeping and cleaning out reported data older than the number of days specified in the `usagemeter_preserve` tunable.
    - Updated the idna library from version 0.5.0 to 1.0.3 to address a security vulnerability.
- Query
    - Added TCP port 9004 listener for incoming queries using the MySQL dialect. This enables MySQL-compatible query tools. Queries are proxied to the ClickHouse query engine.
- Cluster Operations
    - Created a new tunable `turbine_api_require_table_default_storage` requiring explicit storage maps at table creation. This is especially useful for clusters using storage mapping.
    - Removed `partition-vacuum`. It was replaced with a lightweight `partition-cleaner`.
    - Switched `partition-cleaner` to a continuously running service instead of a scheduled job. Removed tunables `partition_cleaner_enabled` and `partition_cleaner_schedule`, decreased memory usage, and introduced `bulk_delete_*` metrics.
    - Converted hard-coded query peer liveness parameters into two new tunables, `query_peer_liveness_initial_delay` and `query_peer_liveness_probe_timeout`.
    - Added a tunable `prometheus_curated_configmap` to offer cluster operators dynamic control over which metrics are exported to Prometheus.
    - Added tunables `disable_traefik_mysql_port`, `mysql_port`, and `mysql_port_disable_tls` used by the `traefik` server for MySQL configuration.
    - Introduced Hydrolix Tunable Names (HTN), a flexible, hierarchical system for configuring tunables across an entire cluster.
    - Introduced two new tunables to improve control over the behavior of rolling updates: `rollout_strategy_max_surge` and `rollout_strategy_max_unavailable`. These are especially useful for deployments running near maximum capacity.
    - Provided automatic deployment of Kibana, Quesma, and Elasticsearch within Hydrolix using the `data_visualization_tools` and `quesma_config` tunables.
- Integrations
    - Removed unnecessary references to shard keys throughout the Spark connector integration.
- API
    - The API documentation available from the `/config/schema/` endpoint has been reorganized and improved, making way for more complete API documentation in future releases.
- UI
    - Improved Advanced Options rows in tables, which can now be clicked, bringing up an editor in the sidebar for better visibility.
    - Introduced a multi-selection dropdown on the System Health page that includes a new All option for logs.
    - Added a refresh interval of three minutes for the table health widget.
    - Added a Delete Project page to improve the deletion process. The page shows statistics about ingest latency and size, and warns before deletion. Your session must have `is_superuser=true` to be allowed to delete projects.
    - Added a Project Health page to show relevant information, such as statistics about ingest latency and size per table.
    - Upgraded Next.js from 14.2.20 to 14.2.22 to address security issues.
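The two rollout tunables listed under Cluster Operations (`rollout_strategy_max_surge` and `rollout_strategy_max_unavailable`) matter most when a pool runs near capacity. The sketch below assumes, without confirmation from these notes, that they map onto the Kubernetes Deployment rolling update fields `maxSurge` and `maxUnavailable` and accept the same value forms (an absolute count or a percentage). Under that assumption, Kubernetes rounds a surge percentage up and an unavailable percentage down. The function name `rollout_bounds` is hypothetical.

```python
def rollout_bounds(replicas: int, max_surge: str, max_unavailable: str) -> tuple[int, int]:
    """Return (max_total_pods, min_available_pods) during a rolling update.

    Assumes Kubernetes Deployment semantics: percentages of max_surge are
    rounded up, percentages of max_unavailable are rounded down.
    """

    def resolve(value: str, round_up: bool) -> int:
        if value.endswith("%"):
            scaled = int(value[:-1]) * replicas  # percent * replicas, still an integer
            # Integer ceiling or floor of scaled / 100, avoiding float rounding.
            return -(-scaled // 100) if round_up else scaled // 100
        return int(value)

    surge = resolve(max_surge, round_up=True)
    unavailable = resolve(max_unavailable, round_up=False)
    return replicas + surge, replicas - unavailable
```

With 10 replicas and both values set to `"25%"`, the update may run up to 13 pods at once and keeps at least 8 available. On a cluster with no spare capacity, a surge of `"0"` with a nonzero unavailable count trades temporary capacity for not needing extra nodes.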
Bug Fixes
- API
    - Handled null storage IDs correctly in API for the presigned URL feature.
    - Allowed users with single project or table permissions to see projects and tables to which they have access. Formerly, they saw nothing.
    - Allowed users with all permissions to a project to use the `/projects` endpoint for listing projects. Earlier, this endpoint was forbidden.
    - Ensured only a single use for any invitation. This fixes an issue allowing a second claim on the same invitation.
    - Introduced a script that automatically runs on upgrade to patch up existing views that were affected by a previously fixed bug involving `datetime` and `datetime64`.
    - Required `force_operation` to delete storage if it was used in a `column_value_mapping`.
    - Accepted blank entries in the Rust URL pre-signer for `endpoint_url`.
    - Fixed a bug involving `burst` and `limit` when creating projects, tables, and transforms.
    - Allowed the `?force_operation=true` query string parameter to bypass verification when updating summary table SQL. This allows recovery from poor configuration.
- Authentication and Permissions
    - Fixed a breaking interaction between unified auth and query string auth.
    - Users with the `super_admin` role no longer see projects that have been marked as deleted.
- Configuration and Control
    - Accepted convertible types for core tunables via the `query_options` endpoint. For example, `hdx_query_unlimited_cnf` is a boolean but can be "0" (false) or "1", "99", etc. (true).
    - Ensured that the Hydrolix operator monitors changes to `curated` and applies them immediately to avoid k8s and operator desynchronization. This also obviates the need to restart the operator.
    - Added a liveness check to `traefik-cfg` so it will terminate instead of hanging while waiting for timeouts.
- Core and Query
    - Corrected Spark connector integration to return an empty ClickHouseRecord to the client, indicating absence of result. Earlier, it would return null, provoking a null pointer exception.
    - Filtered storage IDs requiring presigning to fix Internal Server Error from `/catalog_urls`.
    - Improved resilience when a disconnected peer is canceling a query, so that a sensitive routine no longer crashes under cancellation conditions.
    - Corrected a rare thread safety issue in the ClickHouse 24.8.6.70 upgrade.
    - Reserved use of `hdx_query_output_file_enabled=1` as a query-level setting, not on table, project, or org. Improved consistency when `hdx_query_output_filename` is empty by generating timestamped filenames.
    - Fixed a rare, load-induced segfault crash in `query-head` by wrapping cancellation detection in a mutex.
    - Replaced std::string with fixed-length char array to avoid deallocation failures under heavy memory pressure. This should reduce or eliminate exits with code 139 that also lack stack trace output.
    - Prevented full table scan when SELECT queries with LIMIT find no data in the specified time range.
    - Fixed an issue where a segmentation fault occurred when running a catalog query if the `turbine` service was started before the `postgresql` service.
    - Fixed a segfault triggered when using the `empty()` SQL function with map columns.
    - Detected cases of INSERT INTO an invalid `hdx_storage_id`. Now exits instead of displaying a long error message.
    - Fixed a memory leak involving the AWS API. Aws::ShutdownAPI() is now called when Hydrolix is finished using it.
- Merge and Data Lifecycle
    - Included the shard key along with time range when determining eligibility for partition merging.
- Ingest
    - Avoided a possible segfault in partition handling during parse error checking. This was visible in both `intake-head` and `merge-peer`.
    - Corrected a reference counting error triggered in error-handling during batch ingestion with summary tables enabled. The symptom was an infinite loop of "Outstanding partition status."
    - Stopped leaking file descriptors inside connection pool management during batch, alter and merge, by always closing response bodies.
    - Improved cleanup of orphaned files when upload to storage fails. Previously, a `batch-peer` might hang on shutdown.
    - Corrected a soft-delete interaction with Azure in which files set as 'permanent delete' remained. They will now be deleted.
    - Introduced better reporting of replicas when using a `GET` pool endpoint. The response includes a count of `current_replicas` and a range for `replicas`, matching the setting in the `hydrolixcluster.yaml` file.
    - Allowed multiple autoingest sources to use the same transform. This reduces resource costs during ingestion.
    - Updated autoingest to normalize URI filtering across Azure, GCP, and AWS. The overall behavior remains the same.
    - Added richer error reporting during batch autoingest, including bucket name, error message, and other fields. Previously, only a short text message was returned.
    - Corrected JSON-encoded storages escaping for the batch-head to read. This prevents data loss with autoingest.
    - Updated golang.org/x/net from v0.28 to v0.33 to address a security vulnerability that could allow Denial of Service (DoS) attacks.
- UI
    - Avoided losing user input in the Table Health UI widget sidebar during re-rendering or when the user switches to other tabs or windows.
    - Omitted browser locale information to avoid causing a bad SSO redirect.
    - Removed duplicated text entry block for schema definition when editing dictionaries.
    - Displayed WURFL fields only if WURFL is enabled.
    - Fixed multiple display issues in sidebars when surfacing validation results from API to user. Pages affected were data lifecycle management and also stream, merge, flush, bucket, and rate limit settings.
    - Upgraded the octokit library, which includes a fix for a regular expression backtracking denial of service vulnerability, CVE-2025-25289.
    - Displayed the replicas correctly in the scale page by using the current replicas count.
    - Updated the serialize-javascript library from 6.0.1 to 6.0.2, as well as dependencies, to address CVE-2024-11831.
    - Fixed sidebar behavior when adding or removing a policy while editing a role.
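The convertible-types fix under Configuration and Control can be illustrated with a small coercion helper. This is a sketch of the behavior described in the example for `hdx_query_unlimited_cnf` ("0" is false, "1" or "99" is true), not the actual Hydrolix implementation. The function name `coerce_bool` is hypothetical, and rejecting non-numeric strings is an assumption, since these notes do not say how such values are treated.

```python
def coerce_bool(value) -> bool:
    """Convert a boolean, integer, or integer-like string to a bool.

    Zero maps to False and any other integer maps to True. Non-numeric
    strings are rejected (an assumption of this sketch).
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        try:
            return int(value.strip()) != 0
        except ValueError:
            raise ValueError(f"cannot convert {value!r} to a boolean") from None
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

A client that validates settings before sending them to the `query_options` endpoint could use a helper like this to mirror the server's more lenient parsing.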