25 April 2024 - v4.10.9

Intake Head service, continuous control loop operator, ORDER BY LIMIT optimization

NOTICE:
This version of Hydrolix uses Unified Authentication by default, which is a breaking change for existing customers using Basic Authentication. To maintain current behavior when upgrading a cluster from an earlier release, ensure that unified_auth: false is set in your cluster's configuration.
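
As a pre-upgrade guard, the check below is a minimal sketch that confirms a local copy of the cluster configuration still disables Unified Authentication. The function name has_unified_auth_disabled and the file name hydrolixcluster.yaml are assumptions for illustration; only the unified_auth: false setting comes from this notice.

```shell
# Hypothetical pre-upgrade guard (not part of Hydrolix tooling).
# Succeeds only if the given configuration file contains an uncommented
# "unified_auth: false" line, optionally followed by a trailing comment.
has_unified_auth_disabled() {
  grep -Eq '^[[:space:]]*unified_auth:[[:space:]]*false[[:space:]]*(#.*)?$' "$1"
}

# Usage sketch (the file name is an assumption):
#   has_unified_auth_disabled hydrolixcluster.yaml \
#     || echo "unified_auth: false is missing; Basic Authentication will stop working" >&2
```

The check is purely textual, so it only confirms the line is present in the file, not that the running cluster has picked it up.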

NOTICE:
This version of Hydrolix requires PostgreSQL version 11 or higher. We strongly recommend PostgreSQL 13 or higher, which makes upgrades easier. If you are using PostgreSQL version 11 or 12, enable the ltree extension as a superuser, and consider upgrading to a current version of PostgreSQL.
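
The version rule in this notice can be sketched as a small shell helper. It classifies the value PostgreSQL reports for server_version_num (for example, 120015 for version 12.15). The function name pg_upgrade_action and its return strings are hypothetical, and the usage sketch assumes the standard PGHOST, PGUSER, and PGDATABASE variables point at the database Hydrolix uses.

```shell
# Hypothetical helper: map a PostgreSQL server_version_num value to the
# action this release note asks for.
pg_upgrade_action() {
  ver="$1"
  if [ "$ver" -lt 110000 ]; then
    echo "unsupported"      # older than PostgreSQL 11
  elif [ "$ver" -lt 130000 ]; then
    echo "enable-ltree"     # PostgreSQL 11 or 12
  else
    echo "ok"               # PostgreSQL 13 or higher
  fi
}

# Usage sketch (run as a superuser; connection settings are assumptions):
#   ver="$(psql -At -c 'SHOW server_version_num;')"
#   if [ "$(pg_upgrade_action "$ver")" = "enable-ltree" ]; then
#     psql -c 'CREATE EXTENSION IF NOT EXISTS ltree;'
#   fi
```

Extensions are created per database, so the CREATE EXTENSION statement needs to run against the database Hydrolix connects to.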

NOTICE:
RBAC Version 2, introduced in Hydrolix version 4.6.5, has tighter security than RBAC Version 1. In particular, read-only roles have more limited visibility of the system.

NOTICE:
If you are using batch ingest, and need to roll back to 4.8.x after upgrading to this version, contact Hydrolix Customer Success before rolling back.

Upgrade on GKE:

kubectl apply -f "https://www.hydrolix.io/operator/v4.10.9/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Upgrade on EKS:

kubectl apply -f "https://www.hydrolix.io/operator/v4.10.9/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

Further upgrade instructions can be found in the Update to v4.10 guide.
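
Both upgrade commands above depend on environment variables, and an empty variable produces a malformed URL. The following is a minimal sketch of a helper that fails early in that case. The function name hdx_operator_url is hypothetical; the URL layout and variable names are taken from the commands above.

```shell
# Hypothetical helper: print the operator-resources URL for "gke" or "eks",
# or return non-zero if a required variable is unset or empty.
hdx_operator_url() {
  platform="$1"
  base="https://www.hydrolix.io/operator/v4.10.9/operator-resources"

  if [ -z "${HDX_KUBERNETES_NAMESPACE:-}" ]; then
    echo "HDX_KUBERNETES_NAMESPACE is not set" >&2
    return 1
  fi

  case "$platform" in
    gke)
      if [ -z "${GCP_STORAGE_SA:-}" ]; then
        echo "GCP_STORAGE_SA is not set" >&2
        return 1
      fi
      echo "${base}?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"
      ;;
    eks)
      if [ -z "${AWS_STORAGE_ROLE:-}" ]; then
        echo "AWS_STORAGE_ROLE is not set" >&2
        return 1
      fi
      echo "${base}?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
      ;;
    *)
      echo "unknown platform: ${platform}" >&2
      return 1
      ;;
  esac
}

# Usage sketch:
#   url="$(hdx_operator_url gke)" && kubectl apply -f "$url"
```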

Notable New Features

  • Intake Head service
    • Rewritten streaming ingestion service that removes queuing (and the reliance on Redpanda), increasing efficiency and simplifying the design.
  • ORDER BY LIMIT optimization
    • ORDER BY LIMIT queries now perform more aggressive partition-level and block-level pruning, improving performance while maintaining data integrity.
  • Continuous control loop operator
    • Rather than running just once per deployment, the operator now runs continuously. NOTE: this control loop now overwrites temporary manual changes to deployments. Before making manual changes, scale the operator to 0 replicas with kubectl scale --replicas 0 deployment/operator
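
The pause step in the note above can be wrapped in a pair of helpers, sketched below. The function names and the KUBECTL dry-run variable are hypothetical, and restoring the operator to a single replica is an assumption about the normal deployment size rather than something stated in these notes.

```shell
# KUBECTL defaults to a dry run that only prints the command.
# Set KUBECTL=kubectl to run the commands against a cluster.
KUBECTL="${KUBECTL:-echo kubectl}"

# Stop the control loop so manual changes are not overwritten.
pause_operator() {
  $KUBECTL scale --replicas 0 deployment/operator
}

# Restart the control loop.
# Assumption: the operator normally runs with a single replica.
resume_operator() {
  $KUBECTL scale --replicas 1 deployment/operator
}

# Usage sketch:
#   KUBECTL=kubectl
#   pause_operator
#   ... make temporary manual changes ...
#   resume_operator
```

Once the operator is scaled back up, the control loop runs again, so manual changes to deployments can be expected to be overwritten from that point on.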

General

  • API: Added PATCH support for the organization query options API endpoint
  • API: Extended times in default table stream settings
  • API: Added logging for uploading, updating, and deleting dictionary files
  • API: Improved invitation API
  • API: Updated HTTP OPTIONS responses for SIEM/Kinesis/Kafka
  • API: Summary tables' parents must be valid UUIDs of existing tables
  • Control: Added monitor_ingest_timeout tunable
  • Control: Added skip_init_turbine_api tunable for skipping database migrations in init-turbine-api
  • Control: Hydrolix operator now runs in a continuous control loop
  • Core: Updated AWS SDK for C++ to version 1.11.285
  • Data: Removed unused merge tunables
  • Data: All combinations for role and static credentials for Kinesis are accepted
  • Data: Improved throughput and parallelization of HDX sink, new metrics
  • Data: Optimized pagination of catalog requests, replacing OFFSET/LIMIT with keyset pagination
  • Data: Child table settings are used for determining max_partitions
  • Data: Added support for new Intake Head pod type

Notable Bug Fixes

  • API: Storage map PUT and PATCH behavior works as expected
  • API: Jobs listing endpoint now lists properly according to RBAC
  • API: Deleted jobs no longer impact performance of RBAC
  • API: Jobs are now hard deleted instead of soft deleted
  • API: Purging batch jobs also deletes their sources
  • API: Alter job endpoint validates table and project ids
  • API: Fixed audit log errors
  • API: Dictionary schema definitions are now validated
  • Control: Fixed assorted Traefik availability issues
  • Control: Fixed inability to configure the Kubernetes default storage class
  • Control: Fixed unexpected restarts
  • Control: Fixed version-service high CPU usage
  • Control: Monitor-ingest pod no longer crash loops
  • Control: Operator now reloads configuration when changed
  • Control: Made traefik-cfg more robust, less likely to produce errors
  • Control: Reduced contention between Hydrolix Operator and Kubernetes HPA
  • Control: Re-enabled tunables for disabling cron jobs
  • Control: Alter peer no longer crashes due to missing run-with-indexer
  • Core: Fixed partition path logic for non-standard paths during ALTER
  • Core: Fixed data_path and storage_id in catalog table for ALTER queries
  • Core: settings.summary.enable flag is now disregarded by core, allowing querying
  • Core: Bugfix for ClickHouse client ACCESS_ENTITY_NOT_FOUND error
  • Core: Fixed bug in merge peer concerning denullify columns
  • Core: Fixed ORDER BY LIMIT optimization for functions in WHERE
  • Core: Fixed query peer context-related segfault
  • Core: Query peer dictionaries and configs are now loaded before peer is marked as available
  • Core: Empty partition dirs are removed when LRU cache is disabled
  • Data: Fixed rejector panic on nil event, SIEM hang
  • Data: Added more in-depth metrics around object storage transport
  • Data: Reaper is more tolerant of errors
  • Data: Fixed conflict on crunchydata update
  • Data: Fixed further panic in Kinesis shutdown due to indexer coordination
  • UI: SIEM edit form now loads
  • UI: All pools are now rendered, not just the query-related pools
  • UI: Made small fixes to profile settings page
  • UI: Included error message for alter jobs
  • UI: Setting default bucket for single table now works