Query Metrics
The Hydrolix stack includes Prometheus, an open-source metrics database. While the stack runs, Hydrolix continuously updates its Prometheus instance with metrics. You can query, view, and monitor these metrics through the stack's Grafana integration, or access them from your own monitoring platform.
Using Prometheus directly
Prometheus has its own web-based UI, available by visiting https://<HOST>/prometheus in your web browser.
This view is suitable for quickly entering queries and seeing simple, graphed results. Hydrolix makes this feature available immediately, without any additional setup.
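For scripted access, Prometheus also serves an HTTP API whose instant-query endpoint is `/api/v1/query`. The sketch below assumes the stack exposes that API under the same `/prometheus` prefix as the UI and that no additional authentication is required; both are assumptions to verify against your deployment. The helper names `build_query_url` and `run_query` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(host: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API.

    Assumption: the API is served under the same /prometheus prefix as the UI.
    """
    params = urllib.parse.urlencode({"query": promql})
    return f"https://{host}/prometheus/api/v1/query?{params}"


def run_query(host: str, promql: str) -> list:
    """Run an instant query and return the list of result series."""
    with urllib.request.urlopen(build_query_url(host, promql), timeout=10) as resp:
        payload = json.load(resp)
    # Prometheus reports "success" or "error" in the top-level status field.
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return payload["data"]["result"]
```

Calling `run_query` with your stack's hostname and a metric name such as `process_open_fds` should return one entry per matching series, provided the endpoint is reachable from where the script runs.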
Hydrolix's metrics
The tables below list the available metrics and the components that update them. If more than one component uses a given metric, querying it returns results from all relevant components. You can restrict results to a specific component by adding a service label to your query, e.g. process_open_fds{service="stream-peer"}.
For more information about metric types, refer to Prometheus's documentation.
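When queries are generated from a script, the service filter described above can be added programmatically. This is a minimal sketch; `with_service` is a hypothetical helper name, and `stream-peer` is the only service value taken from this page.

```python
def with_service(metric: str, service: str) -> str:
    """Return a PromQL selector restricted to one component via the service label."""
    # Escape backslashes and double quotes so the value stays a valid PromQL string.
    escaped = service.replace("\\", "\\\\").replace('"', '\\"')
    return f'{metric}{{service="{escaped}"}}'


print(with_service("process_open_fds", "stream-peer"))
# prints: process_open_fds{service="stream-peer"}
```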
General metrics
These metrics track various counters and statistics regarding data ingestion.
Metric | Type | Components | Purpose |
---|---|---|---|
bytes_written | Counter | Batch peer, Stream peer | Bytes written to the indexer. |
partitions_created | Counter | Batch peer, Stream peer | Count of partitions created. |
process_cpu_seconds_total | Counter | Batch peer, Stream head, Stream peer | Total user and system CPU time spent in seconds. |
process_max_fds | Gauge | Batch peer, Stream head, Stream peer | Maximum number of open file descriptors. |
process_open_fds | Gauge | Batch peer, Stream head, Stream peer | Number of open file descriptors. |
process_resident_memory_bytes | Gauge | Batch peer, Stream head, Stream peer | Resident memory size in bytes. |
process_start_time_seconds | Gauge | Batch peer, Stream head, Stream peer | Start time of the process since unix epoch in seconds. |
process_virtual_memory_bytes | Gauge | Batch peer, Stream head, Stream peer | Virtual memory size in bytes. |
process_virtual_memory_max_bytes | Gauge | Batch peer, Stream head, Stream peer | Maximum amount of virtual memory available in bytes. |
promhttp_metric_handler_requests_in_flight | Gauge | Batch peer, Stream head, Stream peer | Current number of scrapes being served. |
promhttp_metric_handler_requests_total | Counter | Batch peer, Stream head, Stream peer | Total number of scrapes by HTTP status code. |
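Counters only ever increase, so they are usually queried through PromQL's `rate()` function, while gauges can be read or combined directly. The sketch below builds two such expressions from metrics in the table above. The helper names are hypothetical, and the `5m` window is an assumed default that should span several scrape intervals.

```python
def per_second_rate(counter: str, window: str = "5m") -> str:
    """Build a PromQL expression for a counter's per-second rate of increase."""
    return f"rate({counter}[{window}])"


def fd_usage_ratio() -> str:
    """Build a PromQL expression for the fraction of file descriptors in use."""
    # Both operands are gauges, so they can be divided without rate().
    return "process_open_fds / process_max_fds"


print(per_second_rate("bytes_written"))
# prints: rate(bytes_written[5m])
print(fd_usage_ratio())
# prints: process_open_fds / process_max_fds
```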
Query metrics
These metrics track query activity, including storage access, caching, and partition reads.
Metric | Type | Components | Purpose |
---|---|---|---|
net_connect_attempts_total | Histogram | Head/Query peer | Histogram of TCP connections attempted to access the storage service (GCS/S3 etc). |
net_connect_seconds | Histogram | Head/Query peer | Histogram of time to connect over TCP to the storage service in seconds (GCS/S3 etc). |
net_dns_resolve_seconds | Histogram | Head/Query peer | Histogram of the DNS resolution time for the storage service in seconds (GCS/S3 etc). |
net_http_response_time | Histogram | Head/Query peer | Histogram of the HTTP response time from the storage service in seconds (GCS/S3 etc). |
net_http_response_bytes | Histogram | Head/Query peer | Histogram of HTTP bytes downloaded from the storage service (GCS/S3 etc). |
net_http_attempts_total | Histogram | Head/Query peer | Histogram of HTTP connections attempted to the storage service. |
net_http_status_code | Histogram | Head/Query peer | Histogram of HTTP status codes returned by the storage service. |
vfs_cache_hitmiss_total | Histogram | Head/Query peer | Histogram of cache status: bucket 0 = cache miss, 1 = cache hit. |
vfs_cache_read_bytes | Histogram | Head/Query peer | Histogram of bytes read from the cache. |
vfs_net_read_bytes | Histogram | Head/Query peer | Histogram of bytes read from the network. |
vfs_cache_lru_file_eviction_total | Histogram | Head/Query peer | Histogram of files evicted from the cache. |
epoll_cpu_seconds | Histogram | Head/Query peer | Histogram of CPU time used in seconds. |
epoll_io_seconds | Histogram | Head/Query peer | Histogram of I/O time in seconds. |
epoll_poll_seconds | Histogram | Head/Query peer | Histogram of time spent waiting for a file descriptor in seconds. |
hdx_storage_r_catalog_partitions_total | Histogram | Head/Query peer | Histogram of the per-query catalog partition count. |
hdx_storage_r_partitions_read_total | Histogram | Head/Query peer | Histogram of the per-query partition read count. |
hdx_storage_r_partitions_per_core_total | Histogram | Head/Query peer | Histogram of the per-core count of partitions used. |
hdx_storage_r_peers_used_total | Histogram | Query peer | Histogram of the total number of peers used. |
hdx_storage_r_cores_used_total | Histogram | Query peer | Histogram of the total number of cores used. |
hdx_storage_r_catalog_timerange | Histogram | Head/Query peer | Histogram of the query time range distribution. |
hdx_partition_columns_read_total | Histogram | Head/Query peer | Histogram of columns read. |
hdx_partition_block_decode_seconds | Histogram | Head/Query peer | Histogram of time spent decoding hdx blocks in seconds. |
hdx_partition_open_seconds | Histogram | Head/Query peer | Histogram of time spent opening hdx partitions in seconds. |
hdx_partition_read_seconds | Histogram | Head/Query peer | Histogram of time spent reading hdx partitions in seconds. |
hdx_partition_skipped_total | Histogram | Head/Query peer | Histogram of partitions skipped due to no matching columns. |
hdx_partition_blocks_read_total | Histogram | Head/Query peer | Histogram of partition blocks read. |
hdx_partition_blocks_avail_total | Histogram | Head/Query peer | Histogram of partition blocks available. |
hdx_partition_index_decision | Histogram | Head/Query peer | Histogram of the partition index decision: bucket 0 = full scan, 1 = partial scan, 2 = no match. |
hdx_partition_index_lookup_seconds | Histogram | Head/Query peer | Histogram of index lookup time in seconds. |
hdx_partition_index_blocks_skipped_percent | Histogram | Head/Query peer | Histogram of skipped index blocks as a percentage. |
hdx_partition_index_blocks_skipped_total | Histogram | Head/Query peer | Histogram of the total number of skipped index blocks. |
hdx_partition_rd_w_err_total | Histogram | Head/Query peer | Histogram of errors: bucket 0 = read error, 1 = write error, 3 = error. |
query_iowait_seconds | Histogram | Head/Query peer | Histogram of query I/O wait in seconds. |
query_cpuwait_seconds | Histogram | Head/Query peer | Histogram of query CPU wait in seconds. |
query_hdx_ch_conv_seconds | Histogram | Head/Query peer | Histogram of time spent converting hdx blocks to ClickHouse in seconds. |
query_health | Histogram | Head/Query peer | Histogram of query health: bucket 0 = initiated error, 1 = succeeded, 2 = error. |
query_peer_availability | Histogram | Head/Query peer | Histogram of query peer availability: bucket 0 = primary_peer_available, 1 = secondary_peer_available, 2 = no_reachable_peers. |
query_attempts_total | Histogram | Head/Query peer | Histogram of total query attempts. |
query_response_seconds | Histogram | Head/Query peer | Histogram of query response time in seconds. |
query_rows_read_total | Histogram | Head/Query peer | Histogram of total query rows read. |
query_read_bytes | Histogram | Head/Query peer | Histogram of total query bytes read. |
query_rows_written_total | Histogram | Head/Query peer | Histogram of total query rows written. |
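Histogram metrics are typically summarized with PromQL's `histogram_quantile()` function. The sketch below builds such an expression, assuming these histograms are exposed in the standard Prometheus form, with a `_bucket` series carrying an `le` label; that layout is not confirmed on this page. The helper name and the `5m` window are illustrative.

```python
def histogram_quantile_expr(metric: str, quantile: float, window: str = "5m") -> str:
    """Build a PromQL expression estimating a quantile of a histogram metric.

    Assumption: the metric exposes standard <metric>_bucket series with an le label.
    """
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    # Summing by le merges the buckets of all components before the estimate.
    return (
        f"histogram_quantile({quantile}, "
        f"sum by (le) (rate({metric}_bucket[{window}])))"
    )


print(histogram_quantile_expr("query_response_seconds", 0.95))
# prints: histogram_quantile(0.95, sum by (le) (rate(query_response_seconds_bucket[5m])))
```

Under those assumptions, the printed expression estimates the 95th-percentile query response time over the last five minutes across all head and query peers.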