
v5.0.0

ClickHouse upgrade, new Shadow Tables feature, Native MySQL client support, better API documentation, easier Kibana integration via Quesma, and Hydrolix Tunable Names

Notable New Features⚓︎

  • ClickHouse Upgrade
  • The ClickHouse library has been upgraded to version 24.8.6.70. In addition to performance improvements and bug fixes, the following functions are now available.
    • Top K: approx_top_k, topKWeighted, and approx_top_sum
    • String related: base64URLEncode, base64URLDecode, tryBase64URLDecode, and groupConcat
    • Time related: toMillisecond
    • Set handling: groupArrayIntersect
    • Windowing: percent_rank
    • and more...
  • Native MySQL client support
  • Client applications can now connect to and query a Hydrolix cluster with MySQL clients. The MySQL server listens on tcp/9004. This opens up more integration possibilities.
  • Better API Documentation
  • The API documentation available from the /config/schema/ endpoint has been reorganized and improved, making way for more complete API documentation in future releases.
  • Simplified Kibana, Quesma, and Elasticsearch Integration
  • Deployment of Kibana, Quesma, and Elasticsearch is now provided automatically within Hydrolix.
  • Hydrolix Tunable Names (HTN)
  • Using a structured naming pattern, all tunables can be applied to multiple services, pools and containers. When defined, an HTN htn:<service>:<pool>:<container>: <value> takes precedence over traditional key-value tunables.
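The precedence rule for HTNs can be illustrated with a minimal sketch. It assumes one tunable at a time, an exact match on all three parts of the `htn:<service>:<pool>:<container>` pattern, and a hypothetical in-memory mapping of HTN keys to values. The names `Target`, `parse_htn`, and `resolve`, as well as the pool and container names in the usage note, are illustrative and not part of Hydrolix. Whether HTNs allow wildcards or empty segments is not stated here, so the sketch does not model them.

```python
from typing import Dict, NamedTuple, Optional


class Target(NamedTuple):
    """The service, pool, and container a value applies to."""
    service: str
    pool: str
    container: str


def parse_htn(key: str) -> Target:
    """Split 'htn:<service>:<pool>:<container>' into its three parts."""
    parts = key.split(":")
    if len(parts) != 4 or parts[0] != "htn":
        raise ValueError(f"not an HTN key: {key!r}")
    return Target(*parts[1:])


def resolve(target: Target, htn_values: Dict[str, str],
            traditional: Optional[str]) -> Optional[str]:
    """Return the HTN value on an exact match, else the traditional value."""
    for key, value in htn_values.items():
        if parse_htn(key) == target:
            return value  # an HTN, when defined, takes precedence
    return traditional
```

For example, with `{"htn:query-head:pool-a:main": "8"}` and a traditional value of `"4"`, `resolve(Target("query-head", "pool-a", "main"), ...)` yields the HTN value, while any other target falls back to the traditional one.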

Breaking Changes⚓︎

🚧 GET table_query_options has been removed

The /table_query_options API path has been removed. The same functionality is now available at the more flexible /query_options path and also works on both tables and projects. See the API documentation for more information:

Table Query Options and Project Query Options

Upgrade⚓︎

Do not skip minor versions when upgrading or downgrading

Skipping versions when upgrading or downgrading Hydrolix can result in database schema inconsistencies and cluster instability. Always upgrade or downgrade sequentially through each minor version.

Example:
Upgrade from 5.5.0 -> 5.6.x -> 5.7.4, not 5.5.0 -> 5.7.4.
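As a rough planning aid, the rule can be sketched as a small function that lists every release to install between two versions. It is not a Hydrolix tool. The function names and the list of available releases are hypothetical, and choosing the newest patch of each intermediate minor version is an assumption, since the guidance above only requires passing through each minor version. The sketch covers upgrades within one major version only.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn '5.6.2' into (5, 6, 2) so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def upgrade_path(current: str, target: str, available: List[str]) -> List[str]:
    """Return the releases to install, in order, without skipping a minor."""
    cur, tgt = parse(current), parse(target)
    if cur[0] != tgt[0]:
        raise ValueError("this sketch only handles one major version")
    if tgt <= cur:
        raise ValueError("target must be newer than current")

    # Newest known patch release per minor version (assumption: any
    # release of the intermediate minor would satisfy the rule).
    newest: Dict[int, Version] = {}
    for v in map(parse, available):
        if v[0] == cur[0] and v > newest.get(v[1], (-1, -1, -1)):
            newest[v[1]] = v

    path: List[str] = []
    for minor in range(cur[1] + 1, tgt[1]):
        if minor not in newest:
            raise ValueError(f"no release known for {cur[0]}.{minor}.x")
        path.append(".".join(map(str, newest[minor])))
    path.append(target)
    return path
```

With a hypothetical release list of `["5.5.0", "5.6.1", "5.6.3", "5.7.4"]`, calling `upgrade_path("5.5.0", "5.7.4", ...)` returns `["5.6.3", "5.7.4"]`: one stop in the 5.6 line, then the target.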

Upgrade on GKE⚓︎

kubectl apply -f "https://www.hydrolix.io/operator/v5.0.0/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Upgrade on EKS⚓︎

kubectl apply -f "https://www.hydrolix.io/operator/v5.0.0/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

Upgrade on LKE⚓︎

kubectl apply -f "https://www.hydrolix.io/operator/v5.0.0/operator-resources?namespace=$HDX_KUBERNETES_NAMESPACE"


Changelog⚓︎

General⚓︎

  • Hydrolix Engine
  • Updated ClickHouse from version 23.8.10.43 to 24.8.6.70. We also updated the following libraries:

    • openssl 1.1.1n -> 3.3.2
    • absl 20211102.0 -> 20240722.0
    • fmt 8.1.1 -> 9.1.0
    • libpq 16.4 -> 16.5
  • Data Transformation

  • Introduced UI support for creating, copying, viewing and customizing transform templates.
  • Allowed UI download of summary table transforms.
  • Introduced support for display and edit of WURFL settings while managing transforms.

  • Ingest

  • Added Shadow Tables feature. Shadow Tables only receive a certain percentage of randomly-sampled data from the input source.
  • Amazon Data Firehose can now decode more than one data object from a single data segment.
  • Improved the SaaS metering functionality ("Usagemeter") by adding better housekeeping and cleaning out reported data older than the number of days specified in the usagemeter_preserve tunable.
  • Updated the idna library from version 0.5.0 to 1.0.3 to address a security vulnerability.
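The percentage-based sampling behind Shadow Tables can be pictured with a short sketch. It is not the Hydrolix implementation: the function name, the record shape, and the per-record independent draw are all assumptions made to illustrate what "a certain percentage of randomly-sampled data" might look like.

```python
import random
from typing import Iterable, Iterator, Optional


def shadow_sample(records: Iterable[dict], percent: float,
                  seed: Optional[int] = None) -> Iterator[dict]:
    """Yield roughly `percent` percent of records, each chosen independently."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)  # seeded only to make a demo repeatable
    for record in records:
        # rng.random() is uniform on [0, 1), so about percent% of rows pass.
        if rng.random() * 100.0 < percent:
            yield record
```

Because each record is drawn independently, the share that reaches the shadow table only approximates the configured percentage, and the approximation tightens as the input grows.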

  • Query

  • Added TCP port 9004 listener for incoming queries using the MySQL dialect. This enables MySQL-compatible query tools. Queries are proxied to the ClickHouse query engine.
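Before pointing a MySQL-compatible tool at a cluster, it can help to confirm that the listener is reachable. The sketch below uses only the Python standard library and checks that a TCP connection opens. It does not speak the MySQL protocol or authenticate, and the function name and any hostname passed to it are hypothetical.

```python
import socket


def mysql_port_reachable(host: str, port: int = 9004,
                         timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A `True` result only shows that something accepts connections on the port; an actual query still requires a MySQL client and valid credentials.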

  • Cluster Operations

  • Created a new tunable turbine_api_require_table_default_storage requiring explicit storage maps at table creation. This is especially useful for clusters using storage mapping.
  • Removed partition-vacuum. It was replaced with a lightweight partition-cleaner.
  • Switched partition-cleaner to a continuously running service instead of a scheduled job. Removed tunables partition_cleaner_enabled and partition_cleaner_schedule, decreased memory usage, and introduced bulk_delete_* metrics.
  • Converted hard-coded query peer liveness parameters into two new tunables, query_peer_liveness_initial_delay and query_peer_liveness_probe_timeout.
  • Added a tunable prometheus_curated_configmap to offer cluster operators dynamic control over which metrics are exported to Prometheus.
  • Added tunables disable_traefik_mysql_port, mysql_port, and mysql_port_disable_tls used by the traefik server for MySQL configuration.
  • Introduced Hydrolix Tunable Names (HTN), a flexible, hierarchical system for configuring tunables across an entire cluster.
  • Introduced two new tunables to improve control over the behavior of rolling updates: rollout_strategy_max_surge and rollout_strategy_max_unavailable. These are especially useful for deployments running near maximum capacity.
  • Provided automatic deployment of Kibana, Quesma, and Elasticsearch within Hydrolix using the data_visualization_tools and quesma_config tunables.

  • Integrations

  • Removed unnecessary references to shard keys throughout the Spark connector integration.

  • API

  • The API documentation available from the /config/schema/ endpoint has been reorganized and improved, making way for more complete API documentation in future releases.

  • UI

  • Improved Advanced Options rows in tables, which can now be clicked, bringing up an editor in the sidebar for better visibility.
  • Introduced a multi-selection dropdown on the System Health page that includes a new All option for logs.
  • Added a refresh interval of three minutes for the table health widget.
  • Added a Delete Project page to improve the deletion process. The page shows statistics about ingest latency and size, and warns before deletion. Your session must have is_superuser=true to be allowed to delete projects.
  • Added a Project Health page to show relevant information, such as statistics about ingest latency and size per table.
  • Upgraded Next.js from 14.2.20 to 14.2.22 to address security issues.

Bug Fixes⚓︎

  • API
  • Handled null storage IDs correctly in API for the presigned URL feature.
  • Allowed users with single project or table permissions to see projects and tables to which they have access. Formerly, they saw nothing.
  • Allowed users with all permissions to a project to use the /projects endpoint for listing projects. Earlier, this endpoint was forbidden.
  • Ensured only a single use for any invitation. This fixes an issue allowing a second claim on the same invitation.
  • Introduced a script that automatically runs on upgrade to patch up existing views that were affected by a previously fixed bug involving datetime and datetime64.
  • Required force_operation to delete storage if it was used in a column_value_mapping.
  • Accepted blank entries in the rust URL pre-signer for endpoint_url.
  • Fixed a bug involving burst and limit when creating projects, tables, and transforms.
  • Allowed the ?force_operation=true query string parameter to bypass verification when updating summary table SQL. This allows recovery from poor configuration.

  • Authentication and Permissions

  • Fixed a breaking interaction between unified auth and query string auth.
  • Users with the super_admin role in Hydrolix will no longer see projects that have been marked as deleted.

  • Configuration and Control

  • Accepted convertible types for core tunables via the query_options endpoint. For example, hdx_query_unlimited_cnf is a boolean but can be "0" (false) or "1", "99", etc. (true).
  • Ensured that the Hydrolix operator monitors changes to curated and applies them immediately to avoid k8s and operator desynchronization. This also obviates the need to restart the operator.
  • A liveness check has been added to traefik-cfg so it will terminate instead of hanging while waiting for timeouts.
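The convertible-type behavior for boolean tunables can be sketched as follows. This is an illustration, not the Hydrolix parser: the function name is hypothetical, and it covers only the cases named above, where "0" is false and other integer strings such as "1" or "99" are true. How Hydrolix treats non-numeric strings is not stated, so the sketch simply raises an error for them.

```python
from typing import Union


def coerce_bool_tunable(value: Union[bool, int, str]) -> bool:
    """Interpret a convertible value as a boolean tunable."""
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        # int() raises ValueError for strings that are not integers.
        return int(value.strip()) != 0
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Under these assumptions, `coerce_bool_tunable("0")` is `False`, while `coerce_bool_tunable("1")` and `coerce_bool_tunable("99")` are both `True`.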

  • Core and Query

  • Corrected the Spark connector integration to return an empty ClickHouseRecord to the client, indicating absence of result. Earlier, it would return null, provoking a null pointer exception.
  • Filtered storage IDs requiring presigning to fix Internal Server Error from /catalog_urls.
  • Improved resilience when a disconnected peer is canceling a query. A sensitive routine no longer crashes under cancellation conditions.
  • Corrected a rare thread safety issue in the ClickHouse 24.8.6.70 upgrade.
  • Reserved use of hdx_query_output_file_enabled=1 as a query-level setting, not on table, project, or org. Improved consistency when hdx_query_output_filename is empty by generating timestamped filenames.
  • Fixed a rare, load-induced segfault crash in query-head by wrapping cancellation detection in a mutex.
  • Replaced std::string with a fixed-length char array to avoid deallocation failures under heavy memory pressure. This should reduce or eliminate exits with code 139 that lack stack trace output.
  • Prevented full table scan when SELECT queries with LIMIT find no data in the specified time range.
  • Fixed an issue where a segmentation fault occurred when running a catalog query if the turbine service was started before the postgresql service.
  • Fixed a segfault triggered when using the empty() SQL function with map columns.
  • Detected cases of INSERT INTO an invalid hdx_storage_id. Now exits instead of displaying a long error message.
  • Fixed a memory leak involving the AWS API. Aws::ShutdownAPI() is now called when Hydrolix is finished using it.

  • Merge and Data Lifecycle

  • Included the shard key along with time range when determining eligibility for partition merging.

  • Ingest

  • Avoided a possible segfault in partition handling during parse error checking. This was visible in both intake-head and merge-peer.
  • Corrected a reference counting error triggered in error handling during batch ingestion with summary tables enabled. The symptom was an infinite loop of "Outstanding partition status."
  • Stopped leaking file descriptors inside connection pool management during batch, alter and merge, by always closing response bodies.
  • Improved cleanup of orphaned files when upload to storage fails. Previously, a batch-peer might hang on shutdown.
  • Corrected a soft-delete interaction with Azure in which files set as 'permanent delete' remained. They will now be deleted.
  • Introduced better reporting of replicas when using a GET pool endpoint. The response includes count of current_replicas, and a range for replicas, matching the setting in the hydrolixcluster.yaml file.
  • Allowed multiple autoingest sources to use the same transform. This reduces resource costs during ingestion.
  • Updated autoingest to normalize URI filtering across Azure, GCP, and AWS. The overall behavior remains the same.
  • Added richer error reporting during batch autoingest, including bucket name, error message, and other fields. Previously, only a short text message was returned.
  • Corrected the escaping of JSON-encoded storages so the batch-head can read them. This prevents data loss with autoingest.
  • Updated golang.org/x/net from v0.28 to v0.33 to address a security vulnerability that could allow Denial of Service (DoS) attacks.

  • UI

  • Avoided losing user input in the Table Health UI widget sidebar during re-rendering or when the user switches to other tabs or windows.
  • Omitted browser locale information to avoid causing a bad SSO redirect.
  • Removed the duplicated text entry block for schema definition when editing dictionaries.
  • Displayed WURFL fields only if WURFL is enabled.
  • Fixed multiple display issues in sidebars when surfacing validation results from the API to the user. Pages affected were data lifecycle management and also stream, merge, flush, bucket, and rate limit settings.
  • Upgraded the octokit library, which includes a fix for a regular expression backtracking denial of service vulnerability, CVE-2025-25289.
  • Displayed the replicas correctly on the scale page by using the current replicas count.
  • Updated the serialize-javascript library from 6.0.1 to 6.0.2, as well as dependencies, to address CVE-2024-11831.
  • When editing a role, the sidebar now behaves appropriately when adding or removing a policy.