14 May 2024 - v4.12.3

Authentication Activity Endpoint, PGX Pool, Usagemeter Updates

NOTICE:
If you need to roll back to 4.10 after upgrading to this version, review the downgrade procedure before doing so.

Upgrade on GKE:

kubectl apply -f "https://www.hydrolix.io/operator/v4.12.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"

Upgrade on EKS:

kubectl apply -f "https://www.hydrolix.io/operator/v4.12.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
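
Both upgrade commands expect the referenced environment variables to be set in the current shell. As a minimal sketch for EKS, the URL can be assembled and inspected before applying it. The namespace and role ARN below are hypothetical placeholders, not values from this release, and the OPERATOR_URL variable name is introduced only for illustration:

```shell
# Hypothetical placeholder values; substitute your own cluster's settings.
HDX_KUBERNETES_NAMESPACE="hydrolix"
AWS_STORAGE_ROLE="arn:aws:iam::123456789012:role/hdx-storage"

# Build the operator-resources URL used by the EKS upgrade command above.
OPERATOR_URL="https://www.hydrolix.io/operator/v4.12.3/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"

# Print the URL so it can be reviewed for empty or mistyped parameters.
echo "${OPERATOR_URL}"

# Once the URL looks right, apply it (left commented out in this sketch):
# kubectl apply -f "${OPERATOR_URL}"
```

An unset variable expands to an empty string, so printing the URL first is a cheap way to catch a missing namespace or storage role. The same pattern applies on GKE with GCP_STORAGE_SA and the gcp-storage-sa parameter.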

Notable New Features

  • Authentication Activity Endpoint
  • PGX Pool
    • Connections to Postgres are more tunable, reliable, and observable with extra Prometheus metrics.
  • Usagemeter Improvements
    • It's now easier to determine resource usage on different cloud providers.

General

  • API: Ensured that bruteForceProtected is enabled on Keycloak realm
  • API: Optimized Keycloak accesses when authenticating
  • API: Dictionary creation is now verified
  • API: HTTP API supports cookie-based authentication
  • API: /users endpoint can now filter out unverified users via emailVerified = True
  • API: Added health check logging
  • API: Better support for different datatypes when using the /generateschema endpoint
  • API: Updated Keycloak and related libraries
  • API: HTTP API for batch jobs no longer accepts PUT or PATCH methods
  • API: Increased readiness check timeout to 10 seconds
  • API: Multiple users can be invited at once
  • API: Better errors when password complexity rules are not met
  • API: Better shard key validation
  • API: Added validation to block changing field types for transforms
  • API: Updating the entire object is now allowed when editing column_value_mapping for buckets
  • Control: Improved load balancing for stream heads
  • Control: Readiness check and retry logic added to Traefik
  • Control: On startup, Vector reads only from the ends of the files it finds, avoiding a logging surge
  • Control: More verbose database connection failure log messages
  • Core: StrDict decoding speed improved by 5x
  • Core: DNS queries now use domain search to support MinIO
  • Core: Added an error message when creating indexes of maps with arrays
  • Data: Improved catalog connection pooling and improved indexing resiliency
  • Data: Introduced persistent auditing for ALTER jobs
  • Data: Provided backlog handling with adjustable prioritization
  • Data: Implemented S3 download buffer to reduce memory and CPU usage of large multitenant clusters
  • UI: Transform download now uses dynamic vendor names in source tabs
  • UI: Pending invitations are displayed on a new Security tab
  • UI: Summary table page analyzes candidate fields
  • UI: 401 exceptions are made for /version to prevent redirects
  • UI: Added merge settings to table UI
  • UI: Query analysis tab shows details of previous query

Bug Fixes

  • API: Keycloak no longer wedges due to PostgreSQL version mismatch
  • API: Storage objects are always updated when force_operation=1
  • API: Pending jobs on tables are cancelled and deleted when deleting those tables
  • API: Bearer tokens are no longer verified on the /login endpoint
  • API: Keycloak no longer fails after startup
  • API: Config object names are validated, disallowing creation of projects and tables with bad names
  • API: Fixed auto-view creation race condition
  • API: Keycloak database backups stream backup file directly to bucket
  • Control: Several usagemeter bugs fixed
  • Control: Kubernetes default storage class can now be configured
  • Control: Traefik will no longer send new traffic to terminating pods
  • Control: Lead pod respects pool target CPU
  • Control: Zookeeper uses much less disk space when used as a checkpoint database for Kinesis
  • Control: HPAs are no longer stuck at 0 after scale off/on cycle
  • Control: Fixed crash loop in monitor-ingest service and added logging and error handling
  • Control: Fewer workers are used for the version service to lower CPU usage
  • Control: Certificates are renewed when the secret containing renewed Grafana certificates is updated
  • Control: Operator no longer unexpectedly restarts services
  • Control: Fixed Traefik restart errors, upgraded to 2.11
  • Core: Turbine server now reads [telemetry] settings
  • Core: Fixed race condition with concurrent queries
  • Core: Adding aggregate columns after data has been ingested has been fixed for summary tables
  • Core: Prometheus now parses core metrics properly
  • Core: Fixed a segfault due to a broken Postgres connection during startup
  • Core: Port is now derived from S3 storage URL
  • Data: Summary settings now behave properly
  • Data: Fixed rejector panic on nil event and SIEM hang
  • Data: Metric server is now started by the alter-peer process
  • UI: default_storage_id will now populate the bucket settings sidebar
  • UI: Default table bucket can now be set
  • UI: Maximum age setting for table flush settings for cold data now displays correct units