30 August 2024 - v4.18.2
4 months ago by Katherine Maack
This release adds new API endpoints, a feature to support our upcoming observability product on AWS called “Cascade”, and configurable worker pools for the ingest head services.
Notable New Features
- New API for Background Tasks
- Configuration changes are now published asynchronously, and there are 4 new endpoints for monitoring asynchronous tasks:
- GET /tasks/ - list all scheduled tasks
- GET /tasks/:id - get details on a scheduled task, including any related events
- GET /tasks/events - list all events
- GET /tasks/events/:id - get details on an event, including expanded job information
- New Kinesis Consumer
- To support our upcoming observability product on AWS called “Cascade,” an additional type of Kinesis source directly utilizes the AWS Kinesis Client Library to provide robust Kinesis ingest. Use of this new kinesis-kcl-consumer requires a manual switch-over from the older Kinesis ingest system.
- Intake Heads and Stream Heads can now be assigned separate pools via the API
- Ingest heads (intake-head and stream-head) are now poolable via the API, and users can create, get, update, patch, and delete pools for both services. A new GET endpoint lists the available ingest URLs: https://{hostname}/config/v1/pools/ingest_endpoints
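As a rough sketch of how the four task-monitoring endpoints fit together, the helper below builds the URL for each one. The function name tasks_url and the HDX_API_BASE variable are hypothetical; these notes do not state the prefix under which /tasks is served, so set HDX_API_BASE to whatever prefix applies to your cluster.

```shell
# Hypothetical helper (not part of Hydrolix): print the URL for one of the
# four task-monitoring endpoints listed in these notes.
# HDX_API_BASE is an assumed variable holding your cluster's API prefix.
tasks_url() {
  kind="$1"      # "task" or "event"
  id="${2:-}"    # optional task or event ID
  case "$kind" in
    task)  prefix="/tasks";        bare="/tasks/" ;;
    event) prefix="/tasks/events"; bare="/tasks/events" ;;
    *)     echo "unknown kind: $kind" >&2; return 1 ;;
  esac
  if [ -n "$id" ]; then
    # Detail endpoints: /tasks/:id and /tasks/events/:id
    printf '%s%s/%s\n' "${HDX_API_BASE%/}" "$prefix" "$id"
  else
    # List endpoints: /tasks/ and /tasks/events
    printf '%s%s\n' "${HDX_API_BASE%/}" "$bare"
  fi
}

# Example (authentication scheme is an assumption):
#   curl -s -H "Authorization: Bearer $HDX_TOKEN" "$(tasks_url task)"
```

Keeping the URL construction separate from the HTTP call makes it straightforward to poll a single task by ID after submitting a configuration change.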
Upgrade on GKE
kubectl apply -f "https://www.hydrolix.io/operator/v4.18.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&gcp-storage-sa=${GCP_STORAGE_SA}"
Upgrade on EKS
kubectl apply -f "https://www.hydrolix.io/operator/v4.18.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}&aws-storage-role=${AWS_STORAGE_ROLE}"
Upgrade on LKE
kubectl apply -f "https://www.hydrolix.io/operator/v4.18.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE}"
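The three upgrade commands above differ only in their query parameters. As a minimal sketch, a small wrapper can build the right URL per platform and stop early when a required variable is missing. The function name operator_url is hypothetical; the URLs and variable names are the ones shown above.

```shell
# Hypothetical helper: print the v4.18.2 operator-resources URL for a platform.
# Accepts "gke", "eks", or "lke"; uses the same environment variables as the
# upgrade commands in these notes.
operator_url() {
  platform="$1"
  # ${VAR:?message} aborts with the message if VAR is unset or empty.
  base="https://www.hydrolix.io/operator/v4.18.2/operator-resources?namespace=${HDX_KUBERNETES_NAMESPACE:?set HDX_KUBERNETES_NAMESPACE first}"
  case "$platform" in
    gke) printf '%s&gcp-storage-sa=%s\n' "$base" "${GCP_STORAGE_SA:?set GCP_STORAGE_SA first}" ;;
    eks) printf '%s&aws-storage-role=%s\n' "$base" "${AWS_STORAGE_ROLE:?set AWS_STORAGE_ROLE first}" ;;
    lke) printf '%s\n' "$base" ;;
    *)   echo "unsupported platform: $platform" >&2; return 1 ;;
  esac
}

# Example:
#   kubectl apply -f "$(operator_url eks)"
```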
General
- API
- Added support for using the force_operation query parameter when deleting buckets. Added a new endpoint which lists resources currently using a given bucket.
- Exposed emailVerified field to users
- Intake Heads and Stream Heads can now be assigned separate pools via the API.
- Config changes are published asynchronously, improving API performance. Added endpoints under /tasks for monitoring the status of background tasks:
- GET /tasks/ - list all scheduled tasks
- GET /tasks/:id - get details on a scheduled task, including any related events
- GET /tasks/events - list all events
- GET /tasks/events/:id - get details on an event, including expanded job information
- To avoid broken summary tables due to changes to transforms, automatically generated summary transforms are now read-only. Modify a table’s summary SQL via the summary settings object in the table patch API.
- Upgraded psycopg from version 2.9.9 to 3.2.1
- Batch job and auto ingest updates for cross-cloud support. We’ve updated the configured data source options for a batch job from a URL to a JSON object containing the following:
{
  "bucket_name": {string},
  "bucket_path": {string} (default: "/"),
  "region": {string},
  "endpoint": {string},
  "cloud": {string},
  "credential_id": {string-uuid} (nullable: true)
}
Existing batch jobs created in versions prior to this change will continue to be returned from the config API in the format they were created with.
- Control
- Reduced the size of init-* containers to reduce overall cluster load, especially when spinning up many peers.
- Core
- Data
- To support our upcoming observability product on AWS called “Cascade,” an additional type of Kinesis source directly utilizes the AWS Kinesis Client Library to provide robust Kinesis ingest. Use of this new kinesis-kcl-consumer requires a manual switch-over from the older Kinesis ingest system.
- Hydrolix now verifies the Turbine indexer is running before issuing indexer requests. This prevents request failures in the event of temporary unavailability of the indexer, for example while it is restarting.
- UI
- Added support for dictionary load levels in the UI
- Added row count, data volume, and cardinality to raw (non-summary) tables. Reorganized summary table analysis tab within the Data page.
- Improvements to the dictionary page.
- Pool form has been modified so the name is read-only when editing
- Batch job and auto ingest updates for cross-cloud support. We’ve expanded the configured data source options when adding a batch job for a table.
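To make the new batch job data source object listed under API above more concrete, here is a minimal sketch that prints one from shell arguments. The function name batch_source_json is hypothetical, the example values (including "aws" for cloud) are placeholders rather than documented values, and the sketch assumes the inputs contain no characters that need JSON escaping. It applies the documented default of "/" for bucket_path and emits null when no credential_id is given.

```shell
# Hypothetical helper: print a batch job data source object with the fields
# named in these notes. Assumes values need no JSON escaping.
# Usage: batch_source_json BUCKET REGION ENDPOINT CLOUD [BUCKET_PATH] [CREDENTIAL_ID]
batch_source_json() {
  bucket_name="$1"
  region="$2"
  endpoint="$3"
  cloud="$4"
  bucket_path="${5:-/}"     # documented default is "/"
  credential_id="${6:-}"    # nullable: emitted as null when omitted
  if [ -n "$credential_id" ]; then
    cred="\"$credential_id\""
  else
    cred="null"
  fi
  printf '{"bucket_name": "%s", "bucket_path": "%s", "region": "%s", "endpoint": "%s", "cloud": "%s", "credential_id": %s}\n' \
    "$bucket_name" "$bucket_path" "$region" "$endpoint" "$cloud" "$cred"
}

# Example with placeholder values:
#   batch_source_json my-bucket us-east-1 https://s3.example.invalid aws
```

Where this object sits inside the full batch job request is not covered in these notes, so the sketch stops at producing the object itself.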
Bug Fixes
- API
- Fixed Transform updates that previously failed with a conflict when array- or map-type elements had the primary field added or removed on update
- Fixed erroneous limitations applied to the scale_profile field: allows this field to be used with any pool type, not just merge-peer; allows users to assign custom scale profiles to the small, medium, and large merge pools; any pool running workloads for the merge-peer service type can be assigned to small, medium, or large. Relaxed validation for pooling scale profiles, allowing names other than I, II, and III.
- Kinesis endpoint no longer responds to HTTP PATCH with 500 error
- Removed dictionary verification query during dictionary creation to avoid errors due to bad user query settings.
- Sample data submitted with a transform is now validated. This ensures it’s the same type as the transform handles, whether it’s CSV or JSON.
- The API now validates the memory_coefficient table setting. Non-numeric strings and negative numbers are disallowed.
- Mismatches between Keycloak and internal user databases are gracefully handled, rather than returning 500 errors. Users with this kind of mismatch are marked for auditing.
- Ensured that all Keycloak users are present in the config API’s database.
- Fixed HTTP 500 errors when attempting to upload a catalog file but not actually sending it. The upload is validated and will return an HTTP 400 if there’s a problem.
- Fixed a bug in which a valid PATCH request to https://{hostname}/config/v1/orgs/{org_id}/projects/{project_id}/tables/{id}/ would throw a 500 error
- When attempting to delete an org, the API will now send an HTTP 405 “Method not allowed” rather than an HTTP 500.
- When creating dictionaries, if there’s a data type array, it must have elements inside. Also, the API now accepts said arrays with elements in dictionary definitions.
- Fixed a bug causing summary tables to incorrectly retain their original source table as the specified value for the parents setting when updated to use a different source table. This change has made the parents setting read-only.
- Core
- Fixed a bug returning erroneous results for IS NULL queries against non-nullable fields
- Partial fix for ‘hydro’ project cache not removing files as expected, causing some pods to enter a failed state after their container storage limits are exceeded. To that end, added tunables for disk cache removal: disk_cache_cull_start_perc (percentage of cache disk space used before starting to remove files), disk_cache_cull_stop_perc (percentage of cache disk space used before stopping removing files), and disk_cache_redzone_start_perc (minimum percentage of cache disk space used to be considered as redzone)
- Correctly return values for query_id and initial_query_id columns in hydro.logs for Indexer and Alter queries
- Queries no longer produce “Unknown HDX SETTING” errors when using hdx_query_optimize_order_by_primary
- Fixed a bug resulting in Turbine server segfaults
- Data
- Users can now decorate rows with the "basic" Scientia Mobile license. Previously, these decorators would only work with the "standard" license, and would error if a "basic" asset was specified in the transform configuration.
- UI
- Fixed bug preventing Audit Trail results from displaying when the result was non-paginated
- Audit Trails tab on Security page: changed query parameters date_min to min_date and date_max to max_date
- Removed expected_tb_per_day from the table and summary table UI
- Fixed Alter Jobs to ensure the ‘commit’ option displays when appropriate. When an Alter job has a pending status, the 'commit' option is now available. When the Alter job has the status done, the 'commit' option is unavailable.
- Dashboard now shows correct data for Queries per Second chart.
- Fixed a bug in which lengthy transform names would intersect with their respective descriptions
- Added two arguments to generate-stream related to sequencing: --seqColumn <string>, which designates a string-typed column for sequence numbers, and --seqPrefix <string>, which prepends the indicated string to the sequence for concurrent generate-stream runs.
- Fixed a bug that caused bucket values to be overwritten when multiple column mappings were used
- Fixed a bug when deleting a column from a transform with multiple columns. Previously, the first column would be deleted regardless of the column selected for deletion.
- Fixed a bug which prevented users from being able to edit the bucket settings of a table in the case of a table created without the settings defined.
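The Audit Trails fix above renames two query parameters. If you happen to have saved links or scripts that still carry the old names (an assumption on our part; the note describes a change in the UI), a small filter like the one below rewrites them. The function name migrate_audit_params and the example URL are hypothetical.

```shell
# Hypothetical helper: rewrite the old Audit Trails query parameter names
# (date_min, date_max) to the new ones (min_date, max_date).
# Reads URLs on stdin, one per line. Only parameters preceded by "?" or "&"
# are rewritten, so names that merely end in "date_min" are left alone.
migrate_audit_params() {
  sed -e 's/\([?&]\)date_min=/\1min_date=/g' \
      -e 's/\([?&]\)date_max=/\1max_date=/g'
}

# Example:
#   echo 'https://example.invalid/audit?date_min=2024-01-01&date_max=2024-02-01' | migrate_audit_params
```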