Advanced Query Options
Circuit breakers
Hydrolix queries can cover vast amounts of data, so the following circuit breakers are provided to help keep resource consumption under control.
These options set limits on a query. If a query exceeds a specified limit, it is cancelled automatically.
hdx_query_catalog_timeout_ms
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_catalog_timeout_ms | 1 | n/a | unlimited |
The maximum amount of time, in milliseconds, a catalog query will run before timing out. If the setting hdx_query_max_execution_time is set, the value for that setting will supersede this one.
hdx_query_max_rows
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_rows | 0 | n/a | unlimited |
Specifies the maximum number of rows that can be evaluated to answer a query.
hdx_query_max_attempts
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_attempts | 0 | n/a | unlimited |
Specifies the maximum number of failures that can occur while running a query before it is cancelled.
hdx_query_max_result_bytes
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_result_bytes | 10000 | n/a | unlimited |
The maximum number of bytes that can be stored on the Query Head before returning data.
hdx_query_max_result_rows
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_result_rows | 0 | n/a | unlimited |
The maximum number of rows that can be stored as a result of a query on the Query Head before returning the response.
hdx_query_max_timerange_sec
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_timerange_sec | 0 | n/a | unlimited |
Specifies the maximum time range, in seconds, that a query may cover. For example, 86400 limits the time range to 1 day. This option has no effect if the query has no time range filter, because the range is calculated as the difference between the timestamps in the WHERE clause.
hdx_query_timerange_required
Query Option | Possible values | Default Value |
---|---|---|
hdx_query_timerange_required | true / false | false |
Boolean that controls whether a query may run without a time range in its WHERE clause. When set to true, a time range is required. Because of the amount of data that can be stored in Hydrolix, this setting protects against a query forcing a scan of the whole data set.
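As a minimal sketch of how the two time range options might be combined, the query below filters on a one-day window and passes both options in the SETTINGS clause. The table and column names are borrowed from the example elsewhere on this page, and the dates are purely illustrative assumptions.

```sql
-- Sketch only: table, column, and dates are illustrative assumptions.
SELECT
    count()
FROM
    sample.taxi.trips
WHERE
    -- A one-day window, which fits within the 86400-second limit below.
    timestamp >= '2023-01-01 00:00:00'
    AND timestamp < '2023-01-02 00:00:00'
SETTINGS
    hdx_query_max_timerange_sec = 86400,
    hdx_query_timerange_required = true
```

With these settings, the same query without the WHERE clause would be expected to be rejected rather than scan the whole table.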
hdx_query_max_partitions
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_partitions | 1 | n/a | unlimited |
The number of partitions a query is allowed to access. If the query would need to read more partitions than this limit, it is not allowed to run.
hdx_query_max_execution_time
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_execution_time | 0 | (Unbound) | 0 |
Number of seconds before a query is canceled. A value of 0 means that there is no limit. Any other value above 0 will cancel the query when the specified number of seconds is reached.
hdx_query_max_columns_to_read
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_columns_to_read | 0 | (Unbound) | 0 |
The number of columns allowed in a SELECT statement before a query is canceled. A value of 0 means that there is no limit. Any value above 0 cancels the query when the specified number of columns is reached.
hdx_query_max_memory_usage
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_memory_usage | 0 | (Unbound) | 0 |
The maximum memory, in bytes, that a single query may use. If the query exceeds this limit, it is cancelled.
This setting applies per query, on each query peer and on the query head.
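Several circuit breakers can be set on the same query. The following is a sketch, assuming the options are passed in the SETTINGS clause as in the example elsewhere on this page. The table and column names come from that example, and the limit values are arbitrary illustrations rather than recommendations.

```sql
-- Sketch only: limit values are arbitrary, not recommended defaults.
SELECT
    cab_type,
    count() AS trips
FROM
    sample.taxi.trips
WHERE
    timestamp >= '2023-01-01 00:00:00'
    AND timestamp < '2023-01-02 00:00:00'
GROUP BY
    cab_type
SETTINGS
    hdx_query_max_execution_time = 30,          -- seconds
    hdx_query_max_rows = 100000000,             -- rows evaluated
    hdx_query_max_result_rows = 10000,          -- rows held on the Query Head
    hdx_query_max_memory_usage = 10000000000    -- bytes, per peer and head
```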
Rate limiting
These options specify limits on the resources available to query processes.
hdx_query_max_peers
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_peers | 1 | (Unbound) | null (all peers) |
By default, Hydrolix distributes query processing across all available query peers in order to maximize massively parallel processing. Setting this flag instructs the query head to instead use only a subset of available peers. If a number greater than the number of available peers is given, all available peers are used.
hdx_query_pool_name
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_pool_name | n/a | n/a | "" (empty string) |
A string with a pool name can be used to instruct Hydrolix where a query should run. Given a set of pools, using the name of a given pool will run the query only in peers belonging to the pool chosen. If the parameter is not set, the query will run in all available peers from all pools by default.
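The two options above might be used as in the sketch below. The pool name 'reporting' is hypothetical, as is the peer count. The table name is taken from the example elsewhere on this page.

```sql
-- Sketch only: 'reporting' is a hypothetical pool name.

-- Run the query only on peers that belong to the named pool.
SELECT count()
FROM sample.taxi.trips
SETTINGS hdx_query_pool_name = 'reporting';

-- Use at most two query peers, regardless of how many are available.
SELECT count()
FROM sample.taxi.trips
SETTINGS hdx_query_max_peers = 2;
```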
hdx_query_max_streams
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_streams | 1 | Twice the count of available CPU cores | null (1 per core) |
By default, each query peer will run one process per CPU core. To limit the number of processes a query might run, set a value here.
hdx_query_max_concurrent_partitions
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_concurrent_partitions | 1 | n/a | 3 |
Hydrolix query processing generally requires each query process to extract data from many HDX partitions. This flag sets a limit on the number of partitions which a query peer reads from at the same time.
Note that this setting is applied per process. For example, if a four-core query peer runs four processes, and each of these opens up to 25 partitions, then each query peer may have as many as 100 partitions open at once.
Decreasing this setting from its default setting of 3 may slow down query performance. Increasing this setting beyond 25 risks excessive memory pressure on the peer. Tread carefully.
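To make the per-process arithmetic concrete, here is a sketch that sets both the number of processes and the per-process partition limit. The values are illustrative assumptions: with 4 processes per peer and 10 concurrent partitions per process, each peer would open at most 4 x 10 = 40 partitions at once.

```sql
-- Sketch only: values chosen to illustrate the per-process multiplication.
SELECT
    count()
FROM
    sample.taxi.trips
SETTINGS
    hdx_query_max_streams = 4,                 -- processes per query peer
    hdx_query_max_concurrent_partitions = 10   -- partitions per process
```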
Other flags
hdx_summary_override_indexes
A summary table indexes all non-aggregate columns by default.
Add this line to the query to exclude columns from indexing:
hdx_summary_override_indexes = 'column_1,column_2,column_3'
In this example, the cab_type and trip_type columns are not indexed.
SELECT
    timestamp,
    trip_type,
    cab_type
FROM
    sample.taxi.trips
GROUP BY
    timestamp,
    trip_type,
    cab_type
FORMAT HDX
SETTINGS
    hdx_primary_key = 'timestamp',
    hdx_summary_override_indexes = 'trip_type,cab_type'
use_query_cache
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
use_query_cache | n/a | false | true |
This option can only be used in the SETTINGS clause of a SQL query. It marks the query as a candidate for ClickHouse query caching. See "Use Query Caching" on the Query Efficiency page.
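A minimal sketch of opting a query into the cache is shown below. The table and column names are taken from the example elsewhere on this page, and whether a given result is actually cached depends on the cache configuration.

```sql
-- Sketch only: marks this query as a candidate for the query cache.
SELECT
    cab_type,
    count() AS trips
FROM
    sample.taxi.trips
GROUP BY
    cab_type
SETTINGS
    use_query_cache = true
```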
hdx_query_distributed_aggregation_memory_efficient
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_distributed_aggregation_memory_efficient | 0 | n/a | unlimited |
Relates to https://github.com/ClickHouse/ClickHouse/pull/20599
hdx_query_max_bytes_before_external_group_by
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_bytes_before_external_group_by | 0 | n/a | unlimited |
The maximum number of bytes that can be held in memory before data is spilled to disk during GROUP BY operations. This setting can help protect against out-of-memory errors, but disk is used in place of memory. Use with care.
hdx_query_max_bytes_before_external_sort
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_max_bytes_before_external_sort | 0 | n/a | unlimited |
The maximum number of bytes that can be held in memory before data is spilled to disk during ORDER BY operations. This setting can help protect against out-of-memory errors, but disk is used in place of memory. Use with care.
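The two spill-to-disk thresholds might be set together on a query that both aggregates and sorts, as in this sketch. The 1000000000-byte (roughly 1 GB) thresholds are arbitrary illustrations, and the table and column names come from the example elsewhere on this page.

```sql
-- Sketch only: thresholds are arbitrary, chosen for illustration.
SELECT
    trip_type,
    cab_type,
    count() AS trips
FROM
    sample.taxi.trips
GROUP BY
    trip_type,
    cab_type
ORDER BY
    trips DESC
SETTINGS
    hdx_query_max_bytes_before_external_group_by = 1000000000,
    hdx_query_max_bytes_before_external_sort = 1000000000
```

Lower thresholds trade memory pressure for disk I/O, so a query that spills is likely to run more slowly than one that fits in memory.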
hdx_query_unlimited_cnf
Query Option | Min Value | Max Value | Default Value |
---|---|---|---|
hdx_query_unlimited_cnf | 0 | 1 | 0 |
When set to 1, this disables limits on the number of clauses when converting the query to conjunctive normal form (CNF). See the ClickHouse documentation for more information. Note that disabling this cap on CNFs will likely cause the query to be very slow and potentially use much more memory.