Query Metrics

The Hydrolix stack includes Prometheus, an open-source metrics database. While the stack runs, Hydrolix continuously updates its Prometheus instance with metrics. You can query, view, and monitor these metrics through the stack's Grafana integration, or access them from your own monitoring platform.

Using Prometheus directly

Prometheus has its own web-based UI, available by visiting https://<HOST>/prometheus in your web browser.

This view is suitable for quickly entering queries and seeing simple, graphed results. Hydrolix makes this feature available immediately, with no additional setup.
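Beyond the web UI, Prometheus also exposes a standard HTTP API for instant queries at /api/v1/query. The sketch below builds such a request URL and parses the JSON response. It assumes the API is served under the same /prometheus path prefix as the UI, which this page does not confirm, and the host name in the usage note is a placeholder.

```python
import json
from urllib.parse import urlencode


def instant_query_url(host, promql, prefix="/prometheus"):
    """Build the URL for a Prometheus instant query.

    The "/prometheus" prefix is an assumption based on the UI address.
    """
    return f"https://{host}{prefix}/api/v1/query?" + urlencode({"query": promql})


def parse_instant_result(body):
    """Return (labels, value) pairs from an instant-query JSON response.

    Handles only the "vector" result shape, where each series carries
    a single [timestamp, "value"] pair.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    return [
        (series["metric"], float(series["value"][1]))
        for series in payload["data"]["result"]
    ]
```

For example, instant_query_url("hydrolix.example.com", "process_open_fds") yields a URL that can be fetched with any HTTP client, and the response body can then be passed to parse_instant_result. Authentication, if your deployment requires it, is not covered here.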

Hydrolix's metrics

The tables below list the available metrics and the components that update them. If more than one component uses a given metric, querying it returns results from all relevant components. You can restrict results to a specific component by adding a service label matcher to your query, e.g. process_open_fds{service="stream-peer"}.
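If you generate queries programmatically, a small helper can render the metric name and service restriction into a PromQL selector. This is a minimal sketch: the function name is hypothetical, and the only service value taken from this page is "stream-peer".

```python
def selector(metric, **labels):
    """Render a PromQL selector with equality matchers, e.g. metric{k="v"}."""
    if not labels:
        return metric
    parts = []
    for name, value in sorted(labels.items()):
        # Escape backslashes and double quotes inside the label value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    return metric + "{" + ",".join(parts) + "}"


print(selector("process_open_fds", service="stream-peer"))
```

Calling selector with no labels returns the bare metric name, which matches series from every component that reports it.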

For more information about metric types, refer to Prometheus's documentation.

General metrics

These metrics track various counters and statistics regarding data ingestion.

| Metric | Type | Components | Purpose |
| --- | --- | --- | --- |
| bytes_written | Counter | Batch peer, Stream peer | Bytes written to the indexer. |
| partitions_created | Counter | Batch peer, Stream peer | Count of partitions created. |
| process_cpu_seconds_total | Counter | Batch peer, Stream head, Stream peer | Total user and system CPU time spent in seconds. |
| process_max_fds | Gauge | Batch peer, Stream head, Stream peer | Maximum number of open file descriptors. |
| process_open_fds | Gauge | Batch peer, Stream head, Stream peer | Number of open file descriptors. |
| process_resident_memory_bytes | Gauge | Batch peer, Stream head, Stream peer | Resident memory size in bytes. |
| process_start_time_seconds | Gauge | Batch peer, Stream head, Stream peer | Start time of the process since Unix epoch in seconds. |
| process_virtual_memory_bytes | Gauge | Batch peer, Stream head, Stream peer | Virtual memory size in bytes. |
| process_virtual_memory_max_bytes | Gauge | Batch peer, Stream head, Stream peer | Maximum amount of virtual memory available in bytes. |
| promhttp_metric_handler_requests_in_flight | Gauge | Batch peer, Stream head, Stream peer | Current number of scrapes being served. |
| promhttp_metric_handler_requests_total | Counter | Batch peer, Stream head, Stream peer | Total number of scrapes by HTTP status code. |
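The metrics listed above are either counters or gauges, and the two are read differently. A gauge such as process_open_fds is meaningful as-is, while a counter such as bytes_written only ever increases (and restarts from zero when the process restarts), so it is normally viewed as a rate, for example rate(bytes_written[5m]) in PromQL. The sketch below shows the idea behind that calculation on raw samples. It is a simplified illustration, not Prometheus's exact algorithm, which also extrapolates to the window edges.

```python
def counter_rate(samples):
    """Per-second increase of a counter over a list of samples.

    samples: (timestamp_seconds, value) pairs in time order.
    Returns None when fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        # A drop means the counter reset to zero (process restart),
        # so the whole new value counts as increase.
        increase += value if value < previous else value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical bytes_written samples with one reset between t=60 and t=120.
print(counter_rate([(0, 0), (60, 120), (120, 60)]))
```

Two gauges can be combined directly without any rate calculation. For instance, process_open_fds / process_max_fds gives the fraction of the file descriptor limit in use.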

Query metrics

These metrics track activity specific to querying.

| Metric | Type | Components | Purpose |
| --- | --- | --- | --- |
| net_connect_attempts_total | Histogram | Head/Query peer | Histogram of TCP connection attempts to the storage service (GCS, S3, etc.). |
| net_connect_seconds | Histogram | Head/Query peer | Histogram of time to connect over TCP to the storage service (GCS, S3, etc.), in seconds. |
| net_dns_resolve_seconds | Histogram | Head/Query peer | Histogram of DNS resolution time for the storage service (GCS, S3, etc.), in seconds. |
| net_http_response_time | Histogram | Head/Query peer | Histogram of HTTP response time from the storage service (GCS, S3, etc.), in seconds. |
| net_http_response_bytes | Histogram | Head/Query peer | Histogram of HTTP bytes downloaded from the storage service (GCS, S3, etc.). |
| net_http_attempts_total | Histogram | Head/Query peer | Histogram of HTTP connection attempts to the storage service. |
| net_http_status_code | Histogram | Head/Query peer | Histogram of HTTP status codes returned by the storage service. |
| vfs_cache_hitmiss_total | Histogram | Head/Query peer | Histogram of cache status: bucket 0 = cache miss, 1 = cache hit. |
| vfs_cache_read_bytes | Histogram | Head/Query peer | Histogram of bytes read from cache. |
| vfs_net_read_bytes | Histogram | Head/Query peer | Histogram of bytes read from the network. |
| vfs_cache_lru_file_eviction_total | Histogram | Head/Query peer | Histogram of cache file evictions. |
| epoll_cpu_seconds | Histogram | Head/Query peer | Histogram of CPU time used, in seconds. |
| epoll_io_seconds | Histogram | Head/Query peer | Histogram of I/O time, in seconds. |
| epoll_poll_seconds | Histogram | Head/Query peer | Histogram of time spent waiting for a file descriptor, in seconds. |
| hdx_blocks_read | Counter | Query peer | Per-query measurement. Number of blocks read from an object store for a given query. |
| hdx_blocks_skipped | Counter | Query peer | Per-query measurement. Number of blocks skipped from an object store for a given query. |
| hdx_storage_r_catalog_partitions_total | Histogram | Head/Query peer | Histogram of per-query catalog partition count. |
| hdx_storage_r_partitions_read_total | Histogram | Head/Query peer | Histogram of per-query partition read count. |
| hdx_storage_r_partitions_per_core_total | Histogram | Head/Query peer | Histogram of partitions used per core. |
| hdx_storage_r_peers_used_total | Histogram | Query peer | Histogram of total storage used. |
| hdx_storage_r_cores_used_total | Histogram | Query peer | Histogram of total cores used. |
| hdx_storage_r_catalog_timerange | Histogram | Head/Query peer | Histogram of query time range distribution. |
| hdx_partition_columns_read_total | Histogram | Head/Query peer | Histogram of columns read. |
| hdx_partition_block_decode_seconds | Histogram | Head/Query peer | Histogram of time spent decoding HDX blocks, in seconds. |
| hdx_partition_open_seconds | Histogram | Head/Query peer | Histogram of time spent opening an HDX partition, in seconds. |
| hdx_partition_read_seconds | Histogram | Head/Query peer | Histogram of time spent reading an HDX partition, in seconds. |
| hdx_partition_skipped_total | Histogram | Head/Query peer | Histogram of partitions skipped due to no matching columns. |
| hdx_partition_blocks_read_total | Histogram | Head/Query peer | Histogram of partition read count. |
| hdx_partition_blocks_avail_total | Histogram | Head/Query peer | Histogram of partition blocks available. |
| hdx_partition_index_decision | Histogram | Head/Query peer | Histogram of partition index decisions: bucket 0 = full scan, 1 = partial scan, 2 = no match. |
| hdx_partition_index_lookup_seconds | Histogram | Head/Query peer | Histogram of index lookup time, in seconds. |
| hdx_partition_index_blocks_skipped_percent | Histogram | Head/Query peer | Histogram of index blocks skipped, as a percentage. |
| hdx_partition_index_blocks_skipped_total | Histogram | Head/Query peer | Histogram of index blocks skipped, as a total count. |
| hdx_partition_rd_w_err_total | Histogram | Head/Query peer | Histogram of errors: bucket 0 = read error, 1 = write error, 3 = error. |
| query_iowait_seconds | Histogram | Head/Query peer | Histogram of query I/O wait, in seconds. |
| query_cpuwait_seconds | Histogram | Head/Query peer | Histogram of query CPU wait, in seconds. |
| query_hdx_ch_conv_seconds | Histogram | Head/Query peer | Histogram of time spent converting HDX blocks to ClickHouse, in seconds. |
| query_health | Histogram | Head/Query peer | Histogram of query health: bucket 0 = initiated error, 1 = succeeded, 2 = error. |
| query_peer_availability | Histogram | Head/Query peer | Histogram of query peer availability: bucket 0 = primary_peer_available, 1 = secondary_peer_available, 2 = no_reachable_peers. |
| query_attempts_total | Histogram | Head/Query peer | Histogram of total query attempts. |
| query_response_seconds | Histogram | Head/Query peer | Histogram of total query response time, in seconds. |
| query_rows_read_total | Histogram | Head/Query peer | Histogram of total query rows read. |
| query_read_bytes | Histogram | Head/Query peer | Histogram of total query bytes read. |
| query_rows_written_total | Histogram | Head/Query peer | Histogram of total query rows written. |
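Most of the query metrics above are histograms. In the standard Prometheus exposition format, a histogram is stored as cumulative bucket counters (a _bucket series with an le upper-bound label) plus _sum and _count series, and quantiles are estimated with the PromQL function histogram_quantile. Assuming Hydrolix's histograms follow that standard layout, which this page does not state, a 95th-percentile query response time might be written as histogram_quantile(0.95, sum by (le) (rate(query_response_seconds_bucket[5m]))). The sketch below illustrates how such an estimate is derived from bucket counts by linear interpolation. The bucket values are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: (upper_bound, cumulative_count) pairs sorted by upper bound,
    with math.inf as the last upper bound. Values are assumed to be
    spread evenly within each bucket.
    """
    if not buckets or not 0 <= q <= 1:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the open-ended bucket, so the best
                # available answer is the highest finite upper bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical response-time buckets, in seconds.
sample = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
print(histogram_quantile(0.5, sample))
```

Because the estimate interpolates within a bucket, its accuracy depends on how finely the bucket boundaries are spaced around the quantile of interest.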