Query Performance Debugging
By setting up the option hdx_query_debug=true
in the query SETTINGS or in the request string in HTTP /query/?hdx_query_debug=true
, Hydrolix will respond with an HTTP Header X-Hdx-Query-Stats
with performance details.
You can see the following example:
SELECT payment_type, count(pickup_datetime) as count
FROM sample.taxi_trips
WHERE pickup_datetime > '2008-01-01'
AND cab_type = 'yellow'
AND payment_type IN ('1', '2', '3', '4', '5', 'Cash', 'cre', 'CRE', 'No ', 'CRD', 'CSH', 'Dis', 'Dispute', 'Cas', 'DIS', 'NOC', 'CASH','No Charge','UNK','CREDIT')
AND trip_id IN (31190610, 32217069, 37860364, 35766523, 28006587, 211374605, 204381276, 210330995, 344756899, 353014930, 348959937, 520894156, 521592271, 521416965, 704689382, 707758682, 702269037, 883426527, 879781110, 872153954, 1048852862, 18432388, 1039420067, 1042644877, 17314313)
GROUP BY payment_type
ORDER BY count
SETTINGS hdx_query_debug=true
As the SQL statement include hdx_query_debug=true
the response include the HTTP header X-HDX-Query-Stats
:
X-HDX-Query-Stats: exec_time=732 rows_read=172032 bytes_read=7220125 num_partitions=100 num_peers=10 query_attempts=1 memory_usage=31468292 index_stats=[{"sample.taxi_trips":{"columns_read":["cab_type","payment_type","pickup_datetime","trip_id"],"indexes_used":["cab_type","payment_type","pickup_datetime","trip_id"],"shard_key_values_used":[]}}]
key | definition |
---|---|
exec_time | Time to execute the query (in ms) and get the response to the query-head, doesn't include response time to end user. |
rows_read | Number of rows read by the different query-peer |
bytes_read | Number of bytes read by the different query-peer |
num_partitions | Number of partitions read by the different query-peer |
num_peers | Number of query peers used for that query |
query_attempts | Number of query attempts by default we retry up to 3x if we had some failures |
memory_usage | Memory used in bytes at the query-head level. For example GROUP BY is done at the head and requires RAM to do the aggregation. |
index_stats | Specify the table used for the query, and for each table which column is read and also which index is used. |
If you are writing complex query using JOIN or INNER SELECT statement it's very useful to look at the index_stats to ensure that Hydrolix fetching only columns used by your statement.
X-HDX-Query-Stats
This header is present only when you are using the HTTP query API endpoint, don't use TCP clickhouse-native for debugging as the information won't be present.
Stats included in logs
If you want to review those query statistics after the fact, those are present in the query logs
Updated about 2 months ago