20 December 2024 - v4.21.2
SSO, Query Option Hierarchy Display
Notable New Features
- SSO
- Authenticate to Hydrolix with your preferred Single-Sign-On provider. Any Keycloak-supported identity provider can be used, such as Google Workspace or GitHub. See our SSO documentation for more details.
- Query Option Hierarchy Display
- The query section of the UI has a new tab that shows the final calculated query options for the query. It also shows the hierarchy of values from organization, project, and table to illustrate how the final value was obtained.
Breaking Changes
- EKS and GKE Clusters Need More Permissions
To continue ingesting data, this version of Hydrolix needs access to the Kubernetes API to perform certain operations. For Amazon EKS and Google GKE deployments, perform these steps before you upgrade to version 4.21.2:
- EKS
First, reset the SA_POLICY_DOC environment variable you created during Hydrolix setup. The SA_POLICY_DOC now has a new line in it for ingest services:
"system:serviceaccount:${HDX_KUBERNETES_NAMESPACE}:ingest"
Once you've set your SA_POLICY_DOC variable, use it in this command to set the new role policy:
update-assume-role-policy --role-name "${HDX_KUBERNETES_NAMESPACE}-bucket" \
  --assume-role-policy-document "${SA_POLICY_DOC}"
- GKE
A new service account policy binding needs to be added with this command:
gcloud iam service-accounts add-iam-policy-binding ${GCP_STORAGE_SA} \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[${HDX_KUBERNETES_NAMESPACE}/ingest]" \
  --project $PROJECT_ID
- Web Browser Cookies May Need to Be Cleared
When visiting the Hydrolix user interface for the first time after upgrading, your web browser cookies may need to be cleared. Make sure you clear only the cookies for the hostname of your Hydrolix UI. Here are links to instructions for Google Chrome, Mozilla Firefox, and Safari.
Upgrade
Upgrade on GKE
kubectl apply -f "https://www.hydrolix.io/operator/v4.21.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"
Upgrade on EKS
kubectl apply -f "https://www.hydrolix.io/operator/v4.21.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
Upgrade on LKE
kubectl apply -f "https://www.hydrolix.io/operator/v4.21.2/operator-resources?namespace=$HDX_KUBERNETES_NAMESPACE"
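The three upgrade commands differ only in the query parameters appended to the operator URL. A minimal sketch of that pattern follows; the function name hdx_operator_url is hypothetical and not part of any Hydrolix tooling, and it assumes the same environment variables used in the commands above are already set.

```shell
# Hypothetical helper (not shipped with Hydrolix): prints the
# operator-resources URL for a given platform and version.
# Usage: hdx_operator_url <gke|eks|lke> <version>
hdx_operator_url() {
  local platform="$1"
  local version="$2"
  local url="https://www.hydrolix.io/operator/${version}/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}"
  case "$platform" in
    gke) url="${url}&gcp-storage-sa=${GCP_STORAGE_SA}" ;;
    eks) url="${url}&aws-storage-role=${AWS_STORAGE_ROLE}" ;;
    lke) ;;  # LKE takes no extra query parameter
    *) echo "unknown platform: ${platform}" >&2; return 1 ;;
  esac
  printf '%s\n' "$url"
}

# Example: print the URL, review it, then apply it.
#   kubectl apply -f "$(hdx_operator_url eks v4.21.2)"
```

Printing the URL before applying it makes it easy to spot an unset variable, which would otherwise produce an empty query parameter.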
Changelog
General
- API
- SSO authentication is now possible through any Keycloak-supported identity provider. See the SSO documentation for more information.
- We've added a new configuration API endpoint, /config/v1/users/current, that returns details about the currently authenticated user.
- To increase security and help prevent DoS attacks, timeouts have been added to all outgoing HTTP requests.
- A new /v1/orgs/$ORG_ID/config_blob/ API endpoint returns the configuration of your Hydrolix cluster.
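As a rough illustration of calling the new current-user endpoint, the sketch below builds the request URL. The helper name, the HDX_HOST and HDX_TOKEN variables, and the use of a bearer token in the Authorization header are all assumptions made for this example; check the API documentation for the authentication your cluster expects.

```shell
# Hypothetical helper: builds the URL of the current-user endpoint
# for a given cluster hostname (the hostname is an assumption).
hdx_current_user_url() {
  printf 'https://%s/config/v1/users/current\n' "$1"
}

# Example call, assuming HDX_HOST holds your cluster hostname and
# HDX_TOKEN holds a valid bearer token:
#   curl -s --max-time 30 \
#     -H "Authorization: Bearer ${HDX_TOKEN}" \
#     "$(hdx_current_user_url "${HDX_HOST}")"
```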
- Control
- To support running Kafka Peers on IPv6 on certain networks, kafka-peers can now run with hostNetwork: true.
- Pools can now be specified as a dictionary (keyed on pool name) within the Hydrolix cluster configuration. Previously, they could only be specified as a list.
- The turbine-api pod now deploys its own query-head container to allow it to make validation queries. If a user scales down their cluster's query-head deployment to zero replicas, this will no longer have unintended side effects outside of query functionality.
- The MaxMind Geo database is now better kept up to date with each release.
- To support SSO, unified authentication is now handled on a more granular per-route basis instead of a per-service basis.
- To increase resilience in unreliable Kubernetes environments, a new HDX-Node daemonset has been introduced which monitors node-to-node connectivity. It also automatically cordons and deletes nodes when they're unreachable.
- The catalog PostgreSQL port can now be specified using the catalog_db_port tunable.
- The new hydrolix.io/ignore-diff=true annotation in the Hydrolix operator prevents manual changes in Kubernetes resources from being reverted by the operator. Read more about it in Manual Resource Configuration.
- A new prometheus_scrape_interval tunable controls the Prometheus scrape_interval global value.
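One plausible way to apply the hydrolix.io/ignore-diff=true annotation is with kubectl annotate. The sketch below only prints the command so it can be reviewed first. The helper name is hypothetical, and it assumes the annotation belongs on the individual resource you have changed by hand; Manual Resource Configuration is the authoritative reference.

```shell
# Hypothetical helper: prints (does not run) a kubectl command that
# adds the ignore-diff annotation to one resource.
# Usage: hdx_ignore_diff_cmd <kind> <name>
hdx_ignore_diff_cmd() {
  local kind="$1"
  local name="$2"
  printf 'kubectl annotate %s %s hydrolix.io/ignore-diff=true --namespace %s\n' \
    "$kind" "$name" "${HDX_KUBERNETES_NAMESPACE}"
}

# Example: review the command, then run it yourself.
#   hdx_ignore_diff_cmd deployment query-head
```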
- Core
- Added support for materialized views. Summary SQL statements only receive a GROUP BY expression if aggregation keys are found in the SQL.
- To support an upcoming release of the Spark connector, the API can now generate pre-signed URLs for partitions, simplifying object storage authentication.
- Stack traces during queries are now suppressed by default. To re-enable them, append SETTINGS hdx_query_debug=1 to your query.
- The indexOf() function now has index-only scan support to increase performance.
- The turbine storage subsystem keeps an internal cache of authorization tokens. Log messages now differentiate between errors in the system and similar warnings involved in authorization token cache refreshing.
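Re-enabling stack traces means appending the settings clause to the query text. A minimal sketch, assuming the query has no existing SETTINGS clause and at most one trailing semicolon; the helper name is hypothetical.

```shell
# Hypothetical helper: appends the debug setting to a SQL query.
# Strips a single trailing semicolon so the clause stays part of
# the statement. Does not handle queries that already contain a
# SETTINGS clause.
hdx_with_debug() {
  local query="${1%;}"
  printf '%s SETTINGS hdx_query_debug=1\n' "$query"
}

# Example:
#   hdx_with_debug 'SELECT count() FROM my_project.my_table;'
```

If a query already has a SETTINGS clause, add hdx_query_debug=1 to that clause by hand instead of appending a second one.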
- Data
- The new, optional Partition Cleaner runs as a cron job, replacing Partition Vacuum to provide an alternate partition deletion system. It's disabled by default for now and controlled by three tunables: partition_cleaner_enabled, partition_cleaner_schedule, and partition_cleaner_dry_run.
- Implemented a command-line report-usage tool to report and re-upload cluster usage data.
- To help troubleshoot and reduce out-of-memory conditions for the intake services intake-head, kinesis-peer, kafka-peer, and akamai-siem-peer, we can optionally detect OOM scenarios in calls to the turbine indexer. If one is detected, we split the bucket of data into smaller files and retry. This is turned off by default and can be enabled with the k8s_oom_kill_detection_enabled tunable.
- Context has been added to merge log lines so that both input and output partition names occur in the same line. This helps when debugging partition issues.
- The Rust SQL library sqlx has been updated from 0.7.4 to 0.8.1 to address SQL injection vulnerabilities.
- A new manifest.json file format and versioned config filename have been rolled out across Data, Core, and API. This versioning makes configuration file changes easier to detect.
- Upgraded Go from 1.21.0 to 1.23.2.
- All intake HTTP services now have a read/write timeout of 30 minutes to help guard against attacks that open many simultaneous connections.
- Incoming data can now be written to an array of storages in a random order using the spread_list directive in the hydrolixcluster.yaml file. This can help avoid cloud storage throttling.
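To try the Partition Cleaner cautiously, one option is to enable it together with its dry-run tunable. The sketch below prints a JSON merge patch for that. It assumes both tunables are booleans set under the spec key of the hydrolixcluster resource, which these notes do not confirm, and the helper name and the resource name hdx are placeholders.

```shell
# Hypothetical helper: prints a JSON merge patch that enables the
# Partition Cleaner. The argument sets partition_cleaner_dry_run
# and defaults to true.
# Assumption: both tunables are booleans under "spec".
hdx_partition_cleaner_patch() {
  printf '{"spec":{"partition_cleaner_enabled":true,"partition_cleaner_dry_run":%s}}\n' \
    "${1:-true}"
}

# Example (resource name "hdx" is a placeholder):
#   kubectl patch hydrolixcluster hdx \
#     --namespace "${HDX_KUBERNETES_NAMESPACE}" \
#     --type merge -p "$(hdx_partition_cleaner_patch true)"
```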
- UI
- The query section of the UI has a new tab that shows the final calculated query options for the query. It also shows the hierarchy of values from organization, project, and table to illustrate how the final value was obtained.
- The administrative UI's favicon.ico has been updated.
- Forms for SIEM have been updated, adding "copy" buttons and allowing client secret fields to be updated.
- Bucket creation and editing UI now supports the "westus3" Azure region.
- A robots.txt file has been added to the Hydrolix Administrative UI to keep web search indexers away.
Bug Fixes
- API
- Transforms containing epochs were sometimes incorrectly identified as conflicting with existing transforms. Datetime64 and datetime handling has been changed to fix this.
- Control
- Security issues from a recent penetration test were addressed.
- The default scale profiles now have intake-head enabled, and other scale profile cleanup has been done.
- The otel_endpoint tunable now works correctly.
- Core
- The SQL url() and urlCluster() functions are now disabled to provide better cluster security. Use of these functions will now return a FUNCTION_NOT_ALLOWED error.
- Query Peers no longer attempt to perform catalog queries, fixing an occasional segfault if Query Head partition evaluation is incomplete.
- Query authentication now supports JWT tokens in HYDROLIX_TOKEN cookies, enabling SSO support.
- A configuration synchronization issue has been fixed, preventing unpredictable results when a project or table is repeatedly added or deleted with the same name.
- Data
- The partition cleaner now considers the PARTITION_GRACE_PERIOD at startup to ensure partitions created within the last day are not deleted.
- Zero-length uploads to Google Cloud Storage no longer result in hung uploads. Instead, the upload is logged as a failed attempt and a new attempt is made.
- The cloud storage I/O library has had preparatory work for supporting batch delete jobs during periodic vacuums.
- UI
- The Analyze Tables UI now works for non-summary tables.
- When creating or editing a batch job or bucket, the UI no longer requires an endpoint.
- Roles may now be edited, no matter what policies or permissions are assigned to them.
- The Query statistics tab now works again, reading data from the API's x-hdx-query-stats header rather than from the response body.
- To improve security, dynamically-generated regular expressions are now created according to pre-defined patterns. Only known confirmation words (DELETE, TRUNCATE, FORCE) are now used for validation.
- The Axios interceptor now correctly extracts the path from the URL regardless of the domain, fixing an error in the dashboard section of the UI.
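Since query statistics now arrive in the x-hdx-query-stats response header, a client can read them without parsing the body. A minimal sketch that extracts the header value from raw HTTP response headers; the helper name and the HDX_QUERY_URL variable are assumptions, and the format of the header value is not described in these notes.

```shell
# Hypothetical helper: reads HTTP response headers on stdin and
# prints the value of the x-hdx-query-stats header, if present.
# Header names are matched case-insensitively.
hdx_query_stats() {
  tr -d '\r' | awk '{
    i = index($0, ":")
    if (i > 0 && tolower(substr($0, 1, i - 1)) == "x-hdx-query-stats") {
      value = substr($0, i + 1)
      sub(/^[ \t]+/, "", value)   # drop whitespace after the colon
      print value
    }
  }'
}

# Example, assuming HDX_QUERY_URL points at your query endpoint:
#   curl -s -D - -o /dev/null --data-binary 'SELECT 1' "${HDX_QUERY_URL}" \
#     | hdx_query_stats
```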