Query Options

Hydrolix lets you control and modify the behaviour of the service at the query level.

How to use query options

Parameters can be set via three different mechanisms:

  • Directly in SQL via SETTINGS
  • In the query parameters of the HTTP request
  • Via HTTP headers sent to the query endpoint

Parameters included in the SQL statement take priority over those in the query string, which in turn take priority over those set in the headers.
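
A minimal sketch of this precedence order, modelled client-side in Python. The function name resolve_query_options and the sample values are hypothetical; this only illustrates the rule described above and is not Hydrolix's internal implementation.

```python
def resolve_query_options(header_opts, query_string_opts, sql_opts):
    """Merge option dicts from lowest to highest priority.

    Later sources overwrite earlier ones, so SQL SETTINGS win over
    query-string parameters, which win over header values.
    """
    resolved = {}
    for source in (header_opts, query_string_opts, sql_opts):
        resolved.update(source)
    return resolved


# Hypothetical values for illustration only.
effective = resolve_query_options(
    header_opts={"hdx_query_max_streams": "10", "hdx_query_pool_name": "mypool"},
    query_string_opts={"hdx_query_pool_name": "somepool"},
    sql_opts={"hdx_query_max_streams": "4"},
)
print(effective)
```

Here the pool name comes from the query string and the stream count from the SQL statement, because each overrides the header value.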

Setting query options via SQL SETTINGS

To specify Hydrolix parameters in SQL, add a SETTINGS clause to your query.
The SETTINGS clause must come at the end of the query.
To set multiple Hydrolix parameters, separate them with a comma (,).

WHERE timestamp >= toDateTime(1636289714) 
AND timestamp <= toDateTime(1636376114) 
AND arrayJoin(data.leaf_cert.all_domains) LIKE '%hydrolix.live%' 
SETTINGS hdx_query_output_file_enabled='true', hdx_query_admin_comment='User: David Sztykman'

In this example, the query results are written to S3, and an admin comment records that the user generating the query is David Sztykman.
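
As a rough sketch, a SETTINGS clause like the one above could be generated from a dictionary. The helper name settings_clause and the table sample_project.sample_table are hypothetical. Quoting every value as a string follows the example above, and the backslash escaping is an assumption about how quotes inside a value should be handled.

```python
def settings_clause(options):
    """Render a dict of Hydrolix options as a SQL SETTINGS clause."""
    parts = []
    for name, value in options.items():
        # Assumption: escape backslashes and single quotes with a backslash.
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        parts.append(f"{name}='{escaped}'")
    return "SETTINGS " + ", ".join(parts)


clause = settings_clause(
    {
        "hdx_query_output_file_enabled": "true",
        "hdx_query_admin_comment": "User: David Sztykman",
    }
)

# Hypothetical table name, used only to show where the clause goes:
# SETTINGS must be the last part of the SELECT statement.
full_query = f"SELECT count() FROM sample_project.sample_table {clause}"
print(full_query)
```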


SETTINGS works only with SELECT

The SQL SETTINGS clause works only with SELECT queries. INSERT is not compatible with custom settings.

Setting query options via query parameters

You can modify the behaviour of Hydrolix's query engine by attaching one or more optional parameters to your query API requests. These options simply take the form of additional HTTP parameters alongside the required query parameter.
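
A minimal sketch of building such a request URL with Python's standard library. The host https://hydrolix.example.com and the helper name build_query_url are placeholders; the /?query= path follows the request example shown on this page.

```python
from urllib.parse import urlencode


def build_query_url(host, sql, **options):
    """Attach the required query parameter plus any optional hdx_* options."""
    params = {"query": sql, **options}
    # urlencode percent-encodes values and joins them with '&'.
    return f"{host}/?{urlencode(params)}"


url = build_query_url(
    "https://hydrolix.example.com",  # placeholder host
    "SELECT 1",
    hdx_query_pool_name="somepool",
)
print(url)
```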

Setting query options via HTTP headers

Additionally, you can set options in the HTTP headers via the X-HDX-query-settings header. The header takes a comma-separated list of key=value pairs. Do not add a space after the comma separators.

As the HTTP protocol allows, the header can appear several times. If an option is repeated, the value in the last header overrides the previous values.

Query example

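In the example below, hdx_query_pool_name is set to somepool, because the query parameter overrides the header value mypool, and hdx_query_max_streams is set to 12, because the last header overrides the earlier value 10.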

GET <YOUR-HYDROLIX-HOST>/?query=....&hdx_query_pool_name=somepool
X-HDX-query-settings: hdx_query_pool_name=mypool,hdx_query_max_streams=10
X-HDX-query-settings: hdx_query_max_streams=12
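
The header handling in this example can be sketched as follows. The parser below is a hypothetical client-side model of the documented behaviour (comma-separated key=value pairs, later headers win), not the server's actual code.

```python
def parse_settings_headers(header_values):
    """Parse repeated X-HDX-query-settings header values, in order.

    A repeated option keeps the value seen last.
    """
    settings = {}
    for value in header_values:
        for pair in value.split(","):
            if not pair:
                continue  # ignore empty segments such as a trailing comma
            key, _, val = pair.partition("=")
            settings[key] = val
    return settings


headers = [
    "hdx_query_pool_name=mypool,hdx_query_max_streams=10",
    "hdx_query_max_streams=12",
]
from_headers = parse_settings_headers(headers)

# The query-string parameter overrides the header value for the same option.
effective = {**from_headers, "hdx_query_pool_name": "somepool"}
print(effective)
```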

Available options

Output Formatting

While Hydrolix's query API returns results as JSON by default, it also supports every output format recognized by its underlying ClickHouse engine.

To specify the output format for a query, add the hdx_query_output_format parameter to your API request, setting its value to any of ClickHouse's supported output formats.

For example, to have the response to a query API GET request formatted as CSV, set hdx_query_output_format to CSV.
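
A hedged sketch of adding the format parameter to an existing request URL. The host is a placeholder and with_output_format is a hypothetical helper name; only the parameter name hdx_query_output_format comes from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_output_format(url, output_format):
    """Return url with hdx_query_output_format set, replacing any old value."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key != "hdx_query_output_format"
    ]
    params.append(("hdx_query_output_format", output_format))
    return urlunsplit(parts._replace(query=urlencode(params)))


csv_url = with_output_format(
    "https://hydrolix.example.com/?query=SELECT+1",  # placeholder host
    "CSV",
)
print(csv_url)
```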


Advanced options

The remaining options described on this page fine-tune how Hydrolix processes a given query.

In most cases, you won't need to change any of these settings from their default values. Hydrolix's query engine is optimized to work with the resource-allocation and caching settings that these defaults represent.

If you have any questions about improving your queries' performance, please contact Hydrolix support.

Circuit Breaker

These options set limits on a query. A query that exceeds one of these limits is not run, or is canceled automatically.


Query Option | Min Value | Max Value | Default Value

Specifies the maximum time range allowed in the query, in seconds. For example, 86400 limits the time range to 1 day. This option has no effect if your query does not filter on a time range, because the limit is checked against the difference between the timestamps in the WHERE clause.
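
The check described above can be sketched as a simple difference between the two timestamp bounds. The function and parameter names are hypothetical, and treating a range exactly equal to the limit as allowed is an assumption.

```python
def timerange_allowed(start_epoch, end_epoch, max_timerange_seconds):
    """Return True if the WHERE-clause time range fits within the limit."""
    # Assumption: a range exactly equal to the limit is still allowed.
    return (end_epoch - start_epoch) <= max_timerange_seconds


# Bounds taken from the SETTINGS example earlier on this page: exactly 1 day apart.
print(timerange_allowed(1636289714, 1636376114, 86400))
```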


Query Option | Possible values | Default Value
hdx_query_timerange_required | true / false | false

Boolean that controls whether a query must specify a time range in the WHERE clause. When set to true, a query without a time range is not allowed to run. Because of the amount of data that can be stored in Hydrolix, this setting protects against a query forcing a scan of the whole data set.


Query Option | Min Value | Max Value | Default Value

Number of partitions a query is allowed to access. If the query would read more partitions than this number, it is not allowed to run.


Query Option | Min Value | Max Value | Default Value

Number of seconds before a query is canceled. A value of 0 means that there is no limit. Any other value above 0 will cancel the query when the specified number of seconds is reached.


Query Option | Min Value | Max Value | Default Value

Number of columns allowed in a SELECT statement before a query is canceled. A value of 0 means that there is no limit. Any other value above 0 will cancel the query when the specified number of columns is reached.


Query Option | Min Value | Max Value | Default Value

Maximum memory, in bytes, for a single query. A query that exceeds max_memory_usage is canceled.
This setting applies per query, on each query peer and on the query head.

Rate Limiting

These options specify limits on the resources available to query processes.


Query Option | Min Value | Max Value | Default Value
hdx_query_max_peers | 1 | (Unbound) | null (all peers)

By default, Hydrolix distributes query processing across all available query peers in order to maximize massively parallel processing. Setting this flag instructs the query head to instead use only a subset of available peers. If a number greater than the number of available peers is given, all available peers are used.
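
The peer-selection rule above amounts to a capped minimum. A minimal sketch, assuming a hypothetical helper peers_used that only models how many peers take part in a query:

```python
def peers_used(available_peers, hdx_query_max_peers=None):
    """Number of query peers that take part in a query."""
    if hdx_query_max_peers is None:
        return available_peers  # default: use every available peer
    if hdx_query_max_peers < 1:
        raise ValueError("hdx_query_max_peers must be at least 1")
    # Asking for more peers than exist falls back to all available peers.
    return min(hdx_query_max_peers, available_peers)


print(peers_used(8), peers_used(8, 3), peers_used(8, 20))
```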


Query Option | Min Value | Max Value | Default Value
hdx_query_pool_name | n/a | n/a | "" (empty string)

A string with a pool name can be used to instruct Hydrolix where a query should run. Given a set of pools, using the name of a given pool will run the query only in peers belonging to the pool chosen. If the parameter is not set, the query will run in all available peers from all pools by default.


Query Option | Min Value | Max Value | Default Value
hdx_query_max_streams | 1 | Twice the count of available CPU cores | null (1 per core)

By default, each query peer will run one process per CPU core. To limit the number of processes a query might run, set a value here.


Query Option | Min Value | Max Value | Default Value

Hydrolix query processing generally requires each query process to extract data from many HDX partitions. This flag sets a limit on the number of partitions which a query peer reads from at the same time.

Note that this setting is applied per process. For example, if a four-core query peer runs four processes, and each of these opens up to 25 partitions, then each query peer may have as many as 100 partitions open at once.

Decreasing this setting from its default setting of 25 may slow down query performance. Increasing this setting beyond 25 risks excessive memory pressure on the peer. Tread carefully.
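
The per-process arithmetic from the example above, written out as a small sketch. The function name is hypothetical; the default of 25 is the value quoted on this page.

```python
def max_open_partitions_per_peer(processes_per_peer, partitions_per_process=25):
    """Upper bound on partitions one query peer may have open at once."""
    return processes_per_peer * partitions_per_process


# A four-core peer running one process per core, at the default limit of 25.
print(max_open_partitions_per_peer(4))
```

Raising the per-process limit multiplies across every process on the peer, which is why memory pressure grows quickly.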

Other options

Options that do not fall under the other categories.


Query Option | Possible values | Default Value
hdx_query_output_file_enabled | true or false | false

Indicates whether you want to save a query result to a file on your cloud storage. The result is saved in the format given by hdx_query_output_format, under a randomly generated filename.


Query Option | Possible values | Default Value

Indicates whether you want to save a query result to a file on your cloud storage. The result is saved in the format given by hdx_query_output_format, under the filename specified. This overwrites the file if it is already present.


Query Option | Possible values | Default Value

Adds a comment to the query, which is stored with the active query. This lets users explain the query. For example, if a query runs every X minutes as part of a reporting tool, you can include that information in the query's comment.


Query Option | Possible values | Default Value

Adds an admin comment to the query, which is stored with the active query. This field is filled in automatically by Superset or Grafana to include username information, in order to track user activity.
