
v5.10.3

Notable new features⚓︎

Anomaly Detection is now in limited availability⚓︎

  • Hydrolix Anomaly Detection identifies unusual patterns in high-volume log data across multiple dimensions, reducing alert fatigue and accelerating incident response.
  • Anomaly Detection provides AI-powered correlation, natural language summaries, root cause identification, and recommended remediation steps. Use a Hydrolix managed large language model (LLM) or bring your own: GPT-4o, Claude, Gemini, Llama, or any OpenAI-compatible model.
  • Contact Hydrolix Support to enable Anomaly Detection. See Anomaly Detection for details.

Hydrolix MCP server enabled by default⚓︎

  • Hydrolix now enables the remote deployment of the Model Context Protocol (MCP) server by default. AI assistants and developer tools that support MCP can connect to a Hydrolix cluster to run natural-language queries, explore schemas, and manage cluster resources. See the MCP server documentation for details.

Automatic indexing for JSON columns⚓︎

  • Hydrolix now indexes JSON columns that contain homogeneous fields or simple arrays, improving query performance without schema changes or manual configuration.
  • Hydrolix creates these indexes during data ingestion.

Independent operator and stack version control⚓︎

  • Hydrolix now supports independent control over the operator version and the Hydrolix image manifest version. A newer operator can manage an older Hydrolix installation by specifying the version parameter in the cluster spec.
  • SRE teams can apply operator fixes and improvements without upgrading the full Hydrolix stack, reducing deployment risk and enabling faster rollout of operational fixes.

Kibana Gateway support for summary tables⚓︎

  • The Kibana Gateway now detects summary tables automatically by querying the Hydrolix Config API. Previously, summary tables required a manual flag in the Kibana Gateway configuration. Summary tables are now visible and queryable in Kibana without additional setup.
  • Hydrolix v5.10 includes Kibana Gateway v1.1.23.

New shard key performance mode⚓︎

  • Introduced a performance sharding mode that uses CRC32 hashing with seven fixed buckets, allowing multiple shard key values to share a bucket. This cuts cloud storage LIST operation costs and improves merge efficiency for high-cardinality shard key columns like customer_id or session_id.
  • Three new table settings control sharding behavior:
    • enable_sharding: enables or disables sharding. When true, both shard_key and shard_key_algo must be set.
    • shard_key_algo: sets the sharding algorithm: "strict" (one bucket per unique value) or "performance" (CRC32 mod seven buckets). Switch between modes as needed, but once set, shard_key_algo can't be cleared.
    • legacy_sharding: read-only; marks tables that use the legacy sharding implementation.
  • On upgrade, Hydrolix migrates existing tables with a shard_key to shard_key_algo = "strict" with legacy_sharding = true.
  • See Table Settings for configuration details and examples.
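The bucket assignment in performance mode can be sketched in a few lines. This is an illustrative model only: the notes state CRC32 modulo seven fixed buckets, but the exact byte encoding and any internal seed Hydrolix uses are assumptions, and `performance_bucket` is a hypothetical helper name.

```python
import zlib

NUM_BUCKETS = 7  # fixed bucket count in "performance" mode

def performance_bucket(shard_key_value: str) -> int:
    """Map a shard key value to one of seven buckets via CRC32.

    Illustrative only: the byte encoding and any seed used internally
    by Hydrolix aren't documented here.
    """
    return zlib.crc32(shard_key_value.encode("utf-8")) % NUM_BUCKETS

# High-cardinality values collapse into at most seven buckets, so cloud
# storage LIST operations scan far fewer prefixes than in strict mode.
buckets = {performance_bucket(f"customer-{i}") for i in range(10_000)}
```

The payoff is the cap on bucket count: however many distinct `customer_id` or `session_id` values arrive, storage layout fans out across at most seven prefixes.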

Breaking changes⚓︎

  • Grafana minimum version and plugin configuration changes

    Hydrolix now requires Grafana v12.3.1 or later (Enterprise or OSS) and has changed the Grafana plugin installation method. The operator no longer installs plugins using the GF_PLUGINS_INSTALL environment variable. Instead, plugin configuration uses the grafana_config.plugins dictionary. The grafana-clickhouse-datasource plugin is no longer installed by default; hydrolix-hydrolix-datasource is now installed by default instead. Review Grafana plugin configuration before upgrading and update any automation that relies on the previous install method.

    See Grafana minimum version requirement for upgrade steps.

  • Shadow table field rename: transform_id to transform

    The shadow_table configuration field transform_id is now transform. The transform field accepts either a valid UUID or a transform name. Update any automation, scripts, or API calls that use transform_id in shadow table configurations.

  • CREATE DICTIONARY and CREATE FUNCTION SQL statements are now blocked

    Hydrolix now returns QUERY_NOT_ALLOWED for CREATE DICTIONARY and CREATE FUNCTION SQL statements. This is a security fix for a server-side request forgery (SSRF) vulnerability identified during a penetration test. Use the Hydrolix API or UI to manage dictionaries and functions instead. Remove any automation or test scripts that issue these statements directly.

  • System resources hydro.logs and megaTransform are now protected

    The hydro.logs table and the megaTransform transform are now protected system resources. The API blocks attempts to delete, rename, or update these resources; only their settings fields remain editable. API calls that attempt these operations will return an error. Remove any automation that modifies or deletes these resources. The full list of protected resources is: project hydro, tables logs and monitor, transforms monitor_ingest_transform and megaTransform, and source hydrologs.

  • Kafka sources now require TLS 1.3

    Kafka sources using SASL_SSL now require the broker to support TLS 1.3 or later. Connections to brokers that support only TLS 1.2 will fail. Verify your Kafka brokers support TLS 1.3 before upgrading.

  • The efficiency metric has been removed

    Three new metrics replace the merge controller efficiency metric, each better characterizing a different data set shape:

    • row_weighted_efficiency
    • mem_weighted_efficiency
    • harmonic_efficiency

    Update any dashboards or alerts that reference the efficiency metric to use the new metric names.

  • Merge controller is now enabled by default

    The merge-controller component is now the default, replacing merge-head. Clusters with no explicit merge configuration will automatically use merge-controller.

    With merge-controller, the dependency on RabbitMQ for merge operations is removed. Previously, enabling merge-controller required explicit configuration.

    If you have automation or configuration that disables merge-controller or relies on merge-head behavior, review it before upgrading.

    merge-head is deprecated as of v5.10 and will be removed in v5.12. If your configuration explicitly uses merge-head, migrate to merge-controller before upgrading to v5.12.

Upgrade instructions⚓︎

Do not skip minor versions when upgrading or downgrading

Skipping versions when upgrading or downgrading Hydrolix can result in database schema inconsistencies and cluster instability. Always upgrade or downgrade sequentially through each minor version.

Example:
Upgrade from 5.7.9 → 5.8.6 → 5.9.5, not 5.7.9 → 5.9.5.

See also detailed Upgrade to v5.10 instructions.

Apply the new Hydrolix operator⚓︎

For self-managed installations, apply the new operator directly with the kubectl command examples below. For installations managed by Hydrolix-supplied tools, follow the procedure prescribed by those tools.

GKE⚓︎

Apply Operator on GKE
kubectl apply -f "https://www.hydrolix.io/operator/v5.10.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

EKS⚓︎

Apply Operator on EKS
kubectl apply -f "https://www.hydrolix.io/operator/v5.10.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

LKE and AKS⚓︎

Apply Operator on LKE and AKS
kubectl apply -f "https://www.hydrolix.io/operator/v5.10.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}"

Grafana minimum version requirement⚓︎

Review and remove Grafana image version pins before upgrade. The operator installs Grafana v12.3.1 during the upgrade to Hydrolix v5.10.3 and uses the non-deprecated plugin installation method available in Grafana v12.1.0+. This applies to both Grafana Enterprise and OSS, but only affects Grafana instances deployed by the Hydrolix operator.

Monitor the upgrade process⚓︎

Kubernetes jobs named init-cluster and init-turbine-api automatically run to upgrade your entire installation to match the new operator's version number. This takes a few minutes, during which you can observe pod restarts with your Kubernetes monitoring tool.

Ensure both the init-cluster and init-turbine-api jobs have completed successfully and that the turbine-api pod has restarted without errors. After that, view the UI and use the API of your new installation as a final check.

If the turbine-api pod doesn't restart successfully, or other functionality is missing, check the logs of the init-cluster and init-turbine-api jobs for details about failures. This can be done using the k9s utility or with the kubectl command:

% kubectl logs -l app=init-cluster
% kubectl logs -l app=init-turbine-api

For additional help, contact Hydrolix support.

Rollback considerations⚓︎

See also detailed Rollback to v5.9 instructions.

Downgrading from v5.10.3 to v5.9.x requires a database schema migration. After applying the v5.9.x operator, open a shell in the turbine-api container and run:

Run Downgrade Migration
./manage.py release_5_9

For additional help, contact Hydrolix support.

Updates⚓︎

These changes include version upgrades and internal dependency bumps.

Config API updates⚓︎

  • Upgraded Python library dependencies.

    • boto3 v1.35.0 → v1.35.99
    • gunicorn v20.1.0 → v23.0.0
    • kafka-python v2.2.15 → v2.3.0
    • procrastinate[django] v2.14.1 → v2.15.1
    • psycopg[binary,pool] v3.2.12 → v3.3.0
  • Upgraded Django to address CVE-2025-64460 and CVE-2025-13372.

    • django v5.2.8 → v5.2.9

Cluster operations updates⚓︎

  • Upgraded AIOHTTP library to >= 3.12.14 to address CVE-2024-52303, CVE-2024-52304, and CVE-2025-53643. Upgraded many other Python library dependencies across many sub-components in the operational stack.

    Detailed listing of version upgrades
    • aiohappyeyeballs v2.4.3 → v2.6.1
    • aiohttp v3.10.10 → v3.13.2
    • aiosignal v1.3.1 → v1.4.0
    • attrs v24.2.0 → v25.4.0
    • boto3 v1.35.52 → v1.42.14
    • botocore v1.35.52 → v1.42.14
    • cachecontrol v0.14.3 → v0.14.4
    • cachetools v5.5.0 → v6.2.4
    • certifi v2024.8.30 → v2025.11.12
    • cffi v1.17.1 → v2.0.0
    • charset-normalizer v3.4.0 → v3.4.4
    • frozenlist v1.4.1 → v1.8.0
    • google-auth v2.35.0 → v2.45.0
    • idna v3.10 → v3.11
    • kubernetes-asyncio v31.1.0 → v31.1.1
    • msgpack v1.1.1 → v1.1.2
    • multidict v6.1.0 → v6.7.0
    • oauthlib v3.2.2 → v3.3.1
    • prometheus-client v0.21.0 → v0.21.1
    • propcache v0.2.0 → v0.4.1
    • pyasn1-modules v0.4.1 → v0.4.2
    • pycparser v2.22 → v2.23
    • pydantic-core v2.23.4 → v2.41.5
    • pydantic v2.9.2 → v2.12.5
    • pytest-asyncio v1.2.0 → v1.3.0
    • pyyaml v6.0.2 → v6.0.3
    • requests v2.32.3 → v2.32.5
    • rsa v4.9 → v4.9.1
    • s3transfer v0.10.3 → v0.16.0
    • six v1.16.0 → v1.17.0
    • typing-extensions v4.12.2 → v4.15.0
    • urllib3 v2.2.3 → v2.6.2
    • websocket-client v1.8.0 → v1.9.0
    • yarl v1.15.3 → v1.22.0

Security updates⚓︎

Security improvements and vulnerability fixes.

UI security⚓︎

  • Fixed Server-Side Request Forgery (SSRF) vulnerability (CWE-918) in the Hydrolix UI. Refactored route handling to implement a "safePaths" registry that validates URLs and query parameters before passing them to HTTP client libraries. This prevents user-controlled URLs from being passed directly to back-end services, which could allow attackers to make unauthorized requests to internal systems. The fix applies across all UI pages, including:
    • projects
    • queries
    • login
    • passwords
    • pools
    • jobs
    • credentials
    • invites
    • roles
    • policies
    • service accounts
    • transforms
    • dictionaries

Operator security⚓︎

  • Applied security-hardening configuration to managed Grafana deployments following Grafana's guidelines. Changes include hiding version information in login/index pages using auth.anonymous.hidden_version, and securing session cookies using cookie_secure: true. These settings reduce information disclosure risks and protect against cookie-based attacks.

Cluster operations security⚓︎

  • Upgraded Python dependencies in control plane services to address security vulnerabilities. Updated gunicorn to v23.0.0 to resolve three vulnerabilities: two with high severity and one with medium severity. Also updated urllib3, werkzeug, and requests to their latest versions.

Config API security⚓︎

  • Added rate limiting to the invitation resend endpoint to prevent resource exhaustion attacks (CWE-770). The endpoint now enforces a 24-hour cooldown period between invitation resends for the same invitation ID. This prevents malicious actors from repeatedly triggering email notifications and consuming system resources through the invite resend API.

  • Fixed a permissions scoping issue in the internal sqlperms endpoint. Table-level select_sql permission grants were incorrectly elevated to project-level access, allowing users with table-level grants to query other tables in the same project. The fix correctly scopes select_sql permissions to their assigned table.

  • Strengthened existing brute force protection on the login endpoint, making the protections more aggressive and ensuring they're applied consistently. Accounts are temporarily locked after consecutive failed authentication attempts. Note that if Keycloak or a cluster pod restarts while an account is locked, the lock is cleared.

Intake security⚓︎

  • Upgraded amazon-kinesis-client from v2.6.0 to v2.7.1 to address a security vulnerability flagged by automated scanning.

Improvements⚓︎

These changes improve behavior, resilience, or usability across components.

Core improvements⚓︎

  • Added hdx_verify_sql, a function that accepts a SQL query and returns success: true if the query is well-formed. Use it to validate transform SQL before saving. Other software systems can call it directly to validate SQL. The function doesn't check table permissions or DDL statements.

  • Added hdx_query_max_perc_before_external_sort, a query-level setting that controls the memory percentage threshold before sort operations spill to external storage. Set to an integer between 1 and 100 to enable spilling when memory usage reaches that percentage; set to 0 to disable external sorting. Defaults to 0. Can't be used with hdx_query_max_bytes_before_external_sort.
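The percentage threshold maps to a spill decision roughly as follows. This is a minimal sketch of the documented semantics, assuming the percentage is taken relative to the query's memory limit; `should_spill_to_disk` is an illustrative name, not an actual Hydrolix function.

```python
def should_spill_to_disk(used_bytes: int, limit_bytes: int, perc_setting: int) -> bool:
    """Model the hdx_query_max_perc_before_external_sort semantics.

    Hypothetical helper: 0 disables external sorting entirely; a value
    from 1 to 100 spills once memory use reaches that percentage of
    the limit.
    """
    if perc_setting == 0:  # 0 disables external sorting (the default)
        return False
    threshold = limit_bytes * perc_setting / 100
    return used_bytes >= threshold
```

For example, with a setting of 80 a sort using 90 of 100 available bytes would spill, while one using 10 would not.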

  • Implemented indexing for JSON columns with homogeneous fields and simple arrays, improving query performance. Hydrolix creates these indexes during data ingestion.

Cluster operations improvements⚓︎

  • Added ingest and service RBAC support to the hdx-auth Traefik plugin for parity with the hdx-traefik-auth sidecar. The plugin, which isn't enabled by default yet, offers better performance and doesn't require a sidecar container.

  • Fixed startup and shutdown ordering for turbine indexers in merge-peer pods. During scale-down events, the indexer container could terminate before the merge-peer container finished processing, silently dropping in-flight merge work. Moving the indexer to init_containers ensures it starts before the merge peer and shuts down after it, following the same lifecycle pattern applied to intake-head pods in v5.9.

  • Made metric inclusion in the Tulugaq scraper more configurable when sending cluster metrics to a Hydrolix table. Users can limit the collector to specific metric families and to metrics with specific labels.

  • Increased the minimum Grafana version supported for operator-managed instances to v12.3.1, for both Grafana Enterprise and OSS. Use the preferred plugin installation method and the hydrolix-hydrolix-datasource plugin.

  • Enhanced the grafana_config tunable to support customization of arbitrary Grafana configuration sections, allowing users to configure any Grafana settings through nested dictionaries. Updated Grafana plugin installation to use the preferred method and added validation for Grafana major version compatibility.

  • Introduced support for specifying an external PostgreSQL or MySQL database in the grafana_config tunable. Without the external database settings, Grafana uses prior behavior, creating a database in the cluster's PostgreSQL instance. This feature paves the way for incremental migration to in-cluster Grafana, without requiring a simultaneous database migration.

Intake improvements⚓︎

  • Added support for ingesting AWS CloudTrail audit logs directly. CloudTrail logs are split on the Records array into individual events without requiring custom preprocessing.

  • Improved resilience of the SIEM ingestion endpoint against malformed request bodies. The intake layer now handles bad JSON payloads from SIEM sources without crashing or dropping the connection.

  • Replaced the single efficiency metric with three metrics that better characterize different data set shapes. The merge controller now exposes distinct row-weighted, memory-weighted, and harmonic-mean efficiency calculations.

  • Added support for custom TLS certificates in Kafka sources using SASL_SSL authentication, enabling connections to brokers with self-signed certificates or hostname mismatches.

Operator improvements⚓︎

  • Added aks as a supported value for the kubernetes_profile tunable. Setting kubernetes_profile: aks enables AKS-specific operator configuration for:

    • PVC storage classes
    • Object storage zone endpoints
    • Load balancer settings
    • Scale profiles
  • Added an image_pull_secret tunable to the Hydrolix operator, enabling clusters to authenticate with private container registries.

  • Added validation for IP allowlist CIDR formatting. When the Hydrolix operator processes a cluster spec containing an invalid CIDR in ip_allowlist, it now emits a warning rather than silently accepting the malformed value. Deployment continues, but the warning surfaces during hkt validate. Both IPv4 and IPv6 formats are validated.
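A validation pass like the one described can be sketched with the standard library. This is an assumption-laden illustration, not the operator's actual code: `warn_on_invalid_cidrs` is a hypothetical helper that returns the entries which would trigger a warning.

```python
import ipaddress

def warn_on_invalid_cidrs(ip_allowlist: list[str]) -> list[str]:
    """Return the allowlist entries that fail CIDR parsing.

    Sketch of the validation described above; ipaddress.ip_network
    accepts both IPv4 and IPv6 CIDR notation.
    """
    bad = []
    for entry in ip_allowlist:
        try:
            ipaddress.ip_network(entry, strict=False)
        except ValueError:
            bad.append(entry)
    return bad
```

Deployment would continue either way; the returned entries correspond to the warnings surfaced during hkt validate.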

  • Enabled the Model Context Protocol (MCP) server by default for all Hydrolix clusters, including TrafficPeak. The MCP server allows AI assistants and development tools to interact with Hydrolix clusters programmatically, enabling features like natural language queries, schema exploration, and cluster management through AI interfaces.

Config API improvements⚓︎

  • Fixed a database connection accumulation issue in the Config API. The API previously opened a database transaction for every request, which caused idle connections to pile up and could eventually exhaust PostgreSQL's connection limit. Connections are now managed with targeted transactions only where needed. This change also updated transform conflict detection to use the columns table as the source of truth instead of the autoview, resolving an edge case race condition in summary table workflows.

  • Added a configurable PostgreSQL connection timeout using HDXAPI_PGCONNTIMEOUT, which defaults to three seconds. Improved transaction handling for operations involving S3 storage to prevent long-running database locks.

  • Updated shadow transform configuration to accept table and transform references by name or UUID instead of requiring cluster-specific IDs. The shadow_table object now uses table and transform fields (replacing table_id and transform_id) that accept either names or UUIDs, enabling reusable Configuration as Code (CaC) definitions across clusters.
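The name-or-UUID acceptance can be illustrated with a small classifier. This is a sketch under stated assumptions: the server-side resolution logic isn't documented here, `classify_transform_ref` is a hypothetical helper, and the `shadow_table` values are made-up examples.

```python
import uuid

def classify_transform_ref(value: str) -> str:
    """Decide whether a shadow_table transform reference is a UUID
    or a name. Anything that parses as a UUID is treated as one;
    everything else is resolved by name."""
    try:
        uuid.UUID(value)
        return "uuid"
    except ValueError:
        return "name"

# Example CaC-friendly configuration: names instead of cluster-specific IDs.
shadow_table = {
    "table": "my_project.shadow_target",  # name reference (hypothetical)
    "transform": "my_transform",          # name or UUID both accepted
}
```

Because names resolve per cluster, the same definition can be applied across environments without rewriting IDs.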

  • Protected critical system resources from deletion or modification through the API. The hydro.logs and hydro.monitor tables and their transforms (megaTransform, monitor_ingest_transform) can no longer be deleted, renamed, or updated. Table settings can still be modified. Name collision prevention blocks creating new resources with protected names.

  • Improved performance of internal sqlperms endpoint to prevent query timeouts. After the addition of column permissions in v5.9, the sqlperms endpoint response time increased significantly on large clusters, causing timeouts that prevented queries from succeeding. The endpoint now responds in under two seconds, with an indefinite cache to maintain fast query startup times.

  • Added a description field to service accounts in the Config API and UI, making it easier to document the purpose of each account.

  • Added = to the set of allowed characters in bucket path configurations, supporting bucket paths that include query-style parameters.

UI improvements⚓︎

  • Added the ability to invalidate service account tokens from the Hydrolix UI. Tokens can now be revoked without deleting and recreating the service account.

  • Removed the 6-hour restriction on custom date ranges in the audit trail UI. Custom time ranges now support any period up to today.

Bug fixes⚓︎

Cluster operations bug fixes⚓︎

  • Improved detection and reporting of Kubernetes, restart, and network errors in the tulugaq metrics scraping tool, which sends Prometheus metrics to a Hydrolix table.

  • Redirected requests for the in-cluster Grafana metrics endpoint to the main Grafana endpoint. This prevents exposure of Grafana's default-accessible metrics endpoint.

  • Improved authentication redirect behavior in the hdx-auth Traefik plugin. When unauthenticated users attempt to access protected pages, the system now stores the original URL and redirects them back to that page after successful login if they have the proper permissions, rather than always redirecting to the homepage.

  • Enhanced the scale_min profile to include Grafana and its dependencies. When both Grafana and the scale_min profile are enabled, Grafana now remains active with the resources and replicas defined in .spec.scale.grafana. This supports dedicated Grafana-only cluster deployments where Hydrolix services aren't needed.

  • Replaced Kubernetes readiness probes with startup probes for Turbine server containers, including query-peer, query-head, intake-head, merge-peer, and other turbine-based pods. Startup probes only execute during pod initialization rather than throughout the pod lifecycle, which is more appropriate since the health check verifies one-time initialization tasks such as dictionary and configuration loading. This also prevents race conditions where pods could pass readiness checks before Turbine containers finish initializing.

  • Corrected the Traefik settings for incoming TCP connections to the MySQL and Thanos servers. The Traefik TLS passthrough settings were interfering with MySQL TLS negotiation on the query head. The fix also disables proxy v2 protocol support for routes with TLS disabled.

  • Fixed missing priorityClass configuration for the intake-peer, http-head, and intake-router intake components and for pooled resources, which caused Kubernetes to evict and reschedule pods. The operator now sets priorityClasses for managed services. Services not deployed by the operator, such as cAdvisor and Kube-State-Metrics, require manual priorityClass configuration.

  • Fixed HDXScaler not applying updates to the aggregation op parameter during configuration reload. Changes to op in the cluster configuration now take effect without restarting the scaler.

  • Fixed HDXScaler EWMA calculation producing incorrect or negative scaling values under CPU saturation, which caused data loss. Also fixed a thread leak and orphaned background worker processes left behind after configuration reload.

  • Fixed incorrect validation errors in the Operator validating webhook for storageClassNames and hdxscaler configuration fields. These fields previously triggered spurious warnings during cluster resource validation.

  • Fixed the job-purge periodic task skipping terminal jobs with a NULL updated_at value, causing them to accumulate in the catalog database.

UI bug fixes⚓︎

  • Fixed Hydrolix UI issues:
    • Transform templates can now be deleted from the UI.
    • Transform creation now works correctly.
    • Service account creation and deletion now work correctly.

Config API bug fixes⚓︎

  • Fixed a login failure on clusters deployed without HTTPS. In some HTTP deployment configurations, Django's protocol detection returned incorrect results, causing authentication to fail completely. The fix ensures cookies are set without the Secure flag when UI_PROTOCOL is http.

  • Improved validation and error reporting for source and bucket credential IDs and names. The change helps avoid accidentally creating conflicting or ambiguous autoingest settings on a table.

  • Implemented configuration support for the HTTP headers used for Dynamic Ingest Routing in the pool PATCH and PUT endpoints. Previously, changes to pools made through the Config API would drop routing settings defined in the cluster spec YAML.

  • Corrected an inconsistency in file reporting between Azure and S3 file list operations. This allows configuration cleanup code to properly prune old config blobs when using Azure as primary storage.

  • Fixed shadow table transform configuration to correctly accept and store the rate sampling parameter, which POST and PUT requests previously ignored. Without this fix, the rate always defaulted to 0.0.

  • Fixed the /config/v1/invites endpoint to prevent duplicate user creation attempts. Password validation failures now return appropriate error codes instead of 500 errors.

  • Fixed summary table creation returning a 500 error when the SQL body was empty or blank. The endpoint now returns a 400 error with a descriptive message.

  • Prevented resending invitations to disabled users. The system now returns an error when attempting to send an invite to a user account that has been disabled.

  • Fixed a backwards compatibility issue with SIEM sources. PATCH requests to existing SIEM sources no longer require the access_details field, restoring the ability to update SIEM sources created before the credential model change. When upgrading, Hydrolix migrates SIEM sources with access details and secret keys exposed directly on the API to a new SIEM credential object associated with the source. SIEM source credentials can now also be updated after creation.

  • Fixed an error when setting an output column description to an empty string. The API now accepts empty strings as valid values for the column description field.

  • Fixed autoingest table PATCH requests returning 400 errors when source_credential or bucket_credential fields were null. Tables with autoingest configured without explicit credentials can now be updated without errors.

  • Fixed SIEM secret migration failing during init-turbine-api when the secret value was empty. The migration now falls back to a safe default value instead of crashing.

  • Removed the write-only hdx_query_max_before_external_group_by query option that caused infinite CaC drift. Use hdx_query_max_bytes_before_external_group_by or hdx_query_max_perc_before_external_group_by directly instead.

Intake bug fixes⚓︎

  • Fixed batch job API routing and status reporting issues. Jobs now return "Pending" status immediately upon creation, preventing race conditions where status queries would return 404 errors. Improved job recovery to properly handle stale work items.

  • Fixed an ingestion crash on tables with ip or uuid columns configured with denullify: true and a default value. Records containing null in those columns triggered a ClickHouse SQL parse error and an indexer crash. The fix correctly quotes IPv6 addresses and UUIDs in the generated COALESCE expressions.

Core bug fixes⚓︎

  • Fixed Full-Text Search (FTS) LIKE queries that include spaces or other word-boundary characters. Previously, a pattern like WHERE message LIKE 'Processed %' returned 0 rows even when matching data existed: the FTS tokenizer treats spaces as word boundaries and stores no tokens spanning them. The fix correctly tokenizes multi-word patterns and intersects the results. See Full-Text Search documentation.
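Conceptually, the fix looks like this: tokenize the multi-word prefix and intersect the posting sets per token, rather than looking up the whole pattern (which contains a word boundary and so is never stored as a token). This is a toy model of the idea, not Hydrolix's FTS internals; the tokenizer rules and index layout are assumptions.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split on non-alphanumeric characters, roughly how an FTS
    tokenizer treats spaces and other word boundaries."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t]

def fts_candidates(pattern_prefix: str, index: dict[str, set[int]]) -> set[int]:
    """Sketch of the fix: intersect posting sets for each token of a
    multi-word LIKE prefix instead of looking up the full pattern."""
    tokens = tokenize(pattern_prefix)
    if not tokens:
        return set()
    result = index.get(tokens[0], set()).copy()
    for tok in tokens[1:]:
        result &= index.get(tok, set())
    return result
```

With a toy index `{"processed": {1, 2}, "order": {2, 3}}`, the prefix of `LIKE 'Processed order %'` narrows to row 2, whereas a whole-pattern lookup would find nothing.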

  • Corrected a discrepancy between the handling of integer column NULL values in Hydrolix partitions and in the ClickHouse library access code. By default, all columns are nullable. The discrepancy manifested only when using the denullify read schema attribute with stored NULLs.

  • Changed query system to return QUERY_NOT_ALLOWED for SQL statements CREATE and REPLACE for DICTIONARY and FUNCTION objects. This removes risks of a Server-Side Request Forgery attack, CWE-918. Dictionary and function management features remain supported in the Config API.

  • Added code to accept and gracefully ignore MySQL-specific handshake settings in MySQL connector code. The Tableau MySQL client can now connect.

  • Fixed INSERT INTO operations to properly set storage_id and metadata.source fields in catalog rows. Previously, partitions created with INSERT INTO had NULL storage_id values, which caused merge operations to fail with "Storage not found" errors because the merge process couldn't determine which storage location contained the partition data. The system now uses the table's configured default storage location when creating partitions.

  • Fixed Full-Text Search (FTS) queries that use separator characters other than spaces, such as hyphens or underscores. Queries containing these separators now return correct results.

  • Fixed a crash and string dictionary corruption that occurred during merge and compaction when processing empty data blocks.

  • Fixed cross-account S3 access for AWS customers using aws_iam_role credentials. Query, indexing, and merge operations previously failed with 403 errors when accessing S3 buckets in a different AWS account. These operations now correctly assume the configured IAM role through STS.