Query Performance Debugging

The HTTP Query API can respond with performance details. Set the query option hdx_query_debug=true:

Add hdx_query_debug=true to a SQL SETTINGS clause
Add query parameter hdx_query_debug=true to a request
Add an HTTP header X-HDX-query-settings: hdx_query_debug=true to a request

Responses include an HTTP Header X-Hdx-Query-Stats.

SELECT payment_type, count(pickup_datetime) as count
FROM sample.taxi_trips
WHERE pickup_datetime > '2008-01-01'
AND cab_type = 'yellow'
AND payment_type IN ('1', '2', '3', '4', '5', 'Cash', 'cre', 'CRE', 'No ', 'CRD', 'CSH', 'Dis', 'Dispute', 'Cas', 'DIS', 'NOC', 'CASH','No Charge','UNK','CREDIT')
AND trip_id IN (31190610, 32217069, 37860364, 35766523, 28006587, 211374605, 204381276, 210330995, 344756899, 353014930, 348959937, 520894156, 521592271, 521416965, 704689382, 707758682, 702269037, 883426527, 879781110, 872153954, 1048852862, 18432388, 1039420067, 1042644877, 17314313)
GROUP BY payment_type
ORDER BY count
SETTINGS hdx_query_debug=true

As the SQL statement include hdx_query_debug=true the response include the HTTP header X-HDX-Query-Stats:

X-HDX-Query-Stats: exec_time=732 rows_read=172032 bytes_read=7220125 num_partitions=100 num_peers=10 query_attempts=1 memory_usage=31468292 index_stats=[{"sample.taxi_trips":{"columns_read":["cab_type","payment_type","pickup_datetime","trip_id"],"indexes_used":["cab_type","payment_type","pickup_datetime","trip_id"],"shard_key_values_used":[]}}]

Main `hdx_query_debug` fields

Key / Field Name	Description
`exec_time`	In-cluster query execution time in milliseconds. Does not include transfer time to end user
`rows_read`	Number of rows read by the query peers
`bytes_read`	Number of bytes read by the query peers
`num_partitions`	Number of partitions read by the query peers
`num_peers`	Number of query peers used for the query
`query_attempts`	Number of query attempts, does not exceed `hdx_max_query_attempts`
`memory_usage`	Memory used, in bytes, by the query head
`pool_name`	Name of the query pool used
`limit_optimization`	Method of optimization used by the query engine for early query termination. Possible values: `where_order_by_limit_n`, `order_by_limit_n`, `limit_n`, `not_applied`
`query_detail_runtime_stats`	JSON object containing details about query execution Details below
`index_stats`	JSON object containing details about indexing for each table referenced. Details below

Fields in `query_detail_runtime_stats`

Field Name	Description
`cached_col_dict_bytes_read`	Total dictionary bytes read from local query peer cache, rather than fetched from object storage
`cached_index_bytes_read`	Total index bytes read from local query peer cache, rather than fetched from object storage
`cached_manifest_bytes_read`	Total manifest bytes read from local query peer cache, rather than fetched from object storage
`catalog_resp_time_ms`	Execution time of catalog query in milliseconds
`hdx_bad_partitions_skipped`	Count of partitions skipped due to data corruption
`hdx_blocks_read`	Count of data blocks fetched from object storage
`hdx_blocks_skipped`	Count of data blocks considered but not fetched due to optimization
`net_col_dict_bytes_read`	Total dictionary bytes read from object storage
`net_data_bytes_read`	Total data bytes read from object storage
`net_index_bytes_read`	Total index bytes read from object storage
`net_manifest_bytes_read`	Total manifest bytes read from object storage

Fields in `index_stats`

Field Name	Description
`columns_read`	Count of columns read
`distributed_join_or_in`	Boolean showing whether JOIN and IN statements were converted to GLOBAL JOIN and GLOBAL IN statements
`indexes_used`	Count of indexes used
`shard_key_values_used`	JSON list of shard key values

If you are writing complex query using JOIN or INNER SELECT statements, it's very useful to look at the index_stats to ensure that Hydrolix fetching only columns used by your statement.

🚧
X-HDX-Query-Stats
This header is present only when you are using the HTTP query API endpoint, don't use TCP clickhouse-native for debugging as the information won't be present.

📘
Stats included in logs
If you want to review those query statistics after the fact, those are present in the query logs

Query Performance Debugging

Main `hdx_query_debug` fields

Fields in `query_detail_runtime_stats`

Fields in `index_stats`

🚧
X-HDX-Query-Stats

📘
Stats included in logs

Main hdx_query_debug fields

Fields in query_detail_runtime_stats

Fields in index_stats

🚧X-HDX-Query-Stats

📘Stats included in logs

Main `hdx_query_debug` fields

Fields in `query_detail_runtime_stats`

Fields in `index_stats`

🚧
X-HDX-Query-Stats

📘
Stats included in logs