Summary Metrics
The Hydrolix stack includes Prometheus, an open-source metrics database. While the stack runs, Hydrolix continuously updates its Prometheus instance with metrics information.
Use Prometheus directly
Prometheus has its own web-based UI, available by visiting https://<YourHostname>/prometheus in your web browser.
This is a basic metric view, suitable for quickly entering queries and seeing simple, graphed results. Hydrolix makes this feature available immediately, without any additional setup.
For more information about metric types, refer to Prometheus's documentation.
Hydrolix's metrics
If more than one component uses a given metric, querying it returns results from all relevant components. You can restrict results to a specific component by adding a service label to your query, for example process_open_fds{service="stream-peer"}.
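The label filter described above can also be built programmatically. The following is a minimal sketch, assuming Prometheus's standard instant-query endpoint (/api/v1/query) is reachable under the same /prometheus path as the web UI; that path and the hostname hydrolix.example.com are assumptions for illustration, and the helper names build_selector and query_url are hypothetical.

```python
from urllib.parse import urlencode


def build_selector(metric: str, **labels: str) -> str:
    """Return a PromQL selector such as metric{label="value"}."""
    if not labels:
        return metric
    parts = []
    for name, value in sorted(labels.items()):
        # Escape backslashes and double quotes so the value stays a valid PromQL string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(name + '="' + escaped + '"')
    return metric + "{" + ",".join(parts) + "}"


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL; base_url is assumed to be the Prometheus root."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


if __name__ == "__main__":
    selector = build_selector("process_open_fds", service="stream-peer")
    # Hypothetical hostname; replace it with your own.
    print(query_url("https://hydrolix.example.com/prometheus", selector))
```

Omitting the label arguments returns the bare metric name, which matches the unfiltered behavior of returning results from every component that reports the metric.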
Each of the ingest method peers has multiple containers. One container handles message acquisition, and the other is the indexer, which indexes data and completes enrichment jobs.
HTTP Stream Summary Ingest
These metrics are specific to the use of streaming data sources.
Traefik
Metric | Type | Components | Purpose |
---|---|---|---|
traefik_service_requests_total | Counter | Traefik | HTTP Traefik request information. |
traefik_service_request_duration_seconds_count/sum/bucket | Histogram | Traefik | Response time of Traefik to the client. |
http_source_request_duration_ns_count/sum/bucket | Histogram | Traefik | Response time from Stream-Head. |
Stream Head and Intake Head
Metric | Type | Components | Purpose |
---|---|---|---|
http_source_byte_count | Counter | Stream head | Count of bytes processed. |
http_source_request_count | Counter | Stream head | Count of HTTP requests. |
http_source_request_duration_ns_count/bucket/sum | Histogram | Stream head | A histogram of HTTP request durations in nanoseconds. |
http_source_request_error_count | Counter | Stream head | Count of HTTP request failures. |
http_source_row_count | Counter | Stream head | Count of rows processed. |
http_source_value_count | Counter | Stream head | Count of values processed. |
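Counters and histogram _sum/_count series only become meaningful when two samples are compared over time. The sketch below shows one way to derive an error ratio and a mean request duration from the Stream head metrics above. The function names and sample numbers are hypothetical, and it assumes that every request counted in http_source_request_error_count is also counted in http_source_request_count.

```python
def counter_increase(earlier: float, later: float) -> float:
    """Increase of a counter between two samples, treating a drop as a restart."""
    # A counter only decreases when the process restarts and counts from zero again,
    # so after a reset the later sample is itself the increase since the restart.
    return later if later < earlier else later - earlier


def error_ratio(requests: float, errors: float) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors / requests if requests > 0 else 0.0


def mean_duration_ms(duration_sum_ns: float, duration_count: float) -> float:
    """Mean duration in milliseconds from increases of a nanosecond _sum and _count."""
    if duration_count <= 0:
        return 0.0
    return duration_sum_ns / duration_count / 1_000_000


if __name__ == "__main__":
    # Hypothetical samples taken one minute apart.
    requests = counter_increase(1_000, 1_060)  # http_source_request_count
    errors = counter_increase(10, 13)          # http_source_request_error_count
    print(error_ratio(requests, errors))
    print(mean_duration_ms(120_000_000, requests))
```

In the Prometheus UI the same comparison is normally expressed with the rate() or increase() functions rather than computed by hand.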
Redpanda
Metric | Type | Components | Purpose |
---|---|---|---|
internal_event_queue_byte_count{mode="sink"} | Counter | Stream Head | Byte count sent to Redpanda. |
internal_event_queue_row_count{mode="sink"} | Counter | Stream Head | Row count sent to Redpanda. |
internal_event_queue_row_count{mode="source"} | Counter | Stream Summary | Row Count received from Redpanda |
internal_event_queue_value_count{mode="source"} | Counter | Stream Summary | Value count received from Redpanda. |
Stream Summary Metrics
Metric | Type | Components | Purpose |
---|---|---|---|
query_count | Counter | Summary Peer | Count of calls to the Catalog. |
query_failure | Counter | Summary Peer | Count of failed Catalog calls. |
query_latency_summary | Summary | Summary Peer | Latency of calls to the Catalog. |
query_latency_summary_count/sum | Count/Sum | Summary Peer | Count and sum of the latency of calls to the Catalog. |
Indexer Metrics
Indexer metrics cover the indexing and enrichment of ingested data.
Metric | Type | Components | Purpose |
---|---|---|---|
indexer_rows_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Total rows indexed (written to partitions) |
indexer_bytes_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Total bytes indexed (written to partitions) |
indexer_partitions_rejected_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Histogram of partitions that could not be written. The value indicates the reason: 0 = raw data parsing failed, 1 = raw data / transform schema mismatch, 3 = error writing partition file, 4 = other error during indexing. |
indexer_partitions_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Total partitions created |
indexer_partition_write_seconds_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Time from receiving indexing query to writing partition file (seconds) |
hdx_sink_row_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Count of rows processed by the indexer and uploaded to storage. Includes Hot and Cold reporting. |
hdx_sink_byte_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Count of bytes processed by the indexer and uploaded to storage. Includes Hot and Cold reporting. |
hdx_sink_value_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Count of values processed by the indexer and uploaded to storage. Includes Hot and Cold reporting. |
hdx_sink_error_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | Count of errors in indexing and uploading to storage. |
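The reason codes listed for indexer_partitions_rejected can be kept in a small lookup table when building alerts or reports. This is only a sketch: the names REJECTION_REASONS and describe_rejection are hypothetical, and because the table above does not list a meaning for code 2, the sketch treats it as unknown.

```python
# Reason codes as listed in the indexer metrics table; code 2 is not documented there.
REJECTION_REASONS = {
    0: "raw data parsing failed",
    1: "raw data / transform schema mismatch",
    3: "error writing partition file",
    4: "other error during indexing",
}


def describe_rejection(code: int) -> str:
    """Translate a rejection code into its documented reason."""
    return REJECTION_REASONS.get(code, f"unknown rejection code {code}")


if __name__ == "__main__":
    for code in (0, 1, 2, 3, 4):
        print(code, describe_rejection(code))
```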
Storage
Cloud/Object Storage metrics.
Each of the object_store* metrics below has these labels:
- provider - Object storage provider (AWS, Azure, GCS)
- code - HTTP Response code
- method - HTTP Method used (POST, GET, etc)
- host - HTTP Host used to target object storage.
Metric | Type | Components | Purpose |
---|---|---|---|
net_http_status_code_bucket | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | HTTP Status Code histogram count from Storage. |
object_store_http_histo | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A histogram of object storage interaction latencies |
object_store_http_summary | Summary | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A summary of object storage interaction latencies |
object_store_http_status_code_count | Count | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of successful HTTP requests against object storage (replaces net_http_status_code_count). Requests resulting in 500 are still considered successful. |
object_store_http_error_count | Count | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of HTTP request errors (timeouts, connection errors, etc.) |
object_store_http_bytes_tx | Count | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of bytes transmitted to object storage (request body only) |
object_store_http_bytes_rx | Count | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of bytes received from object storage (response body only) |
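Because object_store_http_status_code_count counts any request that received an HTTP response, including responses with status 500, the code label has to be inspected to separate healthy responses from failing ones. The following sketch groups samples by status class. The sample structure (a label dictionary paired with a value) and the function name are assumptions for illustration, not a Hydrolix or Prometheus API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

# One sample: the series labels plus the counter value.
Sample = Tuple[Mapping[str, str], float]


def count_by_status_class(samples: Iterable[Sample]) -> Dict[str, float]:
    """Sum counter values per HTTP status class ("2xx", "5xx", ...)."""
    totals: Dict[str, float] = defaultdict(float)
    for labels, value in samples:
        code = labels.get("code", "")
        # "200" becomes "2xx"; anything that is not a three-digit code goes to "other".
        status_class = code[0] + "xx" if len(code) == 3 and code.isdigit() else "other"
        totals[status_class] += value
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical samples using the documented labels.
    samples = [
        ({"provider": "AWS", "code": "200", "method": "GET"}, 10.0),
        ({"provider": "AWS", "code": "204", "method": "POST"}, 5.0),
        ({"provider": "AWS", "code": "500", "method": "GET"}, 2.0),
    ]
    print(count_by_status_class(samples))
```

Timeouts and connection errors never receive a status code, so they appear in object_store_http_error_count rather than in any status class here.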
Additional Metrics
Additional Metrics are provided for the management and control of different components.
General Metrics
These metrics track various counters and statistics regarding data ingestion.
Metric | Type | Purpose |
---|---|---|
process_cpu_seconds_total | Counter | Total user and system CPU time spent in seconds. |
process_max_fds | Gauge | Maximum number of open file descriptors. |
process_open_fds | Gauge | Number of open file descriptors. |
process_resident_memory_bytes | Gauge | Resident memory size in bytes. |
process_start_time_seconds | Gauge | Start time of the process since unix epoch in seconds. |
process_virtual_memory_bytes | Gauge | Virtual memory size in bytes. |
process_virtual_memory_max_bytes | Gauge | Maximum amount of virtual memory available in bytes. |
promhttp_metric_handler_requests_in_flight | Gauge | Current number of scrapes being served. |
promhttp_metric_handler_requests_total | Counter | Total number of scrapes by HTTP status code. |
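Two of the gauges above are most useful in combination: dividing process_open_fds by process_max_fds shows how close a process is to its file descriptor limit. A minimal sketch follows; the function names and the 80% threshold are arbitrary assumptions rather than Hydrolix recommendations.

```python
def fd_utilization(open_fds: float, max_fds: float) -> float:
    """Fraction of the file descriptor limit currently in use."""
    if max_fds <= 0:
        raise ValueError("max_fds must be positive")
    return open_fds / max_fds


def is_near_fd_limit(open_fds: float, max_fds: float, threshold: float = 0.8) -> bool:
    """True when utilization meets or exceeds the (assumed) threshold."""
    return fd_utilization(open_fds, max_fds) >= threshold


if __name__ == "__main__":
    # Hypothetical gauge values for one component.
    print(fd_utilization(512, 1024))
    print(is_near_fd_limit(900, 1024))
```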
Go environment metrics
These metrics track resources used by Hydrolix's Go environments.
Metric | Type | Purpose |
---|---|---|
go_gc_duration_seconds | Summary | A summary of the pause duration of garbage collection cycles. |
go_goroutines | Gauge | Number of goroutines that currently exist. |
go_info | Gauge | Information about the Go environment. |
go_memstats_alloc_bytes | Gauge | Number of bytes allocated and still in use. |
go_memstats_alloc_bytes_total | Counter | Total number of bytes allocated, even if freed. |
go_memstats_buck_hash_sys_bytes | Gauge | Number of bytes used by the profiling bucket hash table. |
go_memstats_frees_total | Counter | Total number of frees. |
go_memstats_gc_cpu_fraction | Gauge | The fraction of this program's available CPU time used by the GC since the program started. |
go_memstats_gc_sys_bytes | Gauge | Number of bytes used for garbage collection system metadata. |
go_memstats_heap_alloc_bytes | Gauge | Number of heap bytes allocated and still in use. |
go_memstats_heap_idle_bytes | Gauge | Number of heap bytes waiting to be used. |
go_memstats_heap_inuse_bytes | Gauge | Number of heap bytes that are in use. |
go_memstats_heap_objects | Gauge | Number of allocated objects. |
go_memstats_heap_released_bytes | Gauge | Number of heap bytes released to OS. |
go_memstats_heap_sys_bytes | Gauge | Number of heap bytes obtained from system. |
go_memstats_last_gc_time_seconds | Gauge | Number of seconds since 1970 of last garbage collection. |
go_memstats_lookups_total | Counter | Total number of pointer lookups. |
go_memstats_mallocs_total | Counter | Total number of mallocs. |
go_memstats_mcache_inuse_bytes | Gauge | Number of bytes in use by mcache structures. |
go_memstats_mcache_sys_bytes | Gauge | Number of bytes used for mcache structures obtained from system. |
go_memstats_mspan_inuse_bytes | Gauge | Number of bytes in use by mspan structures. |
go_memstats_mspan_sys_bytes | Gauge | Number of bytes used for mspan structures obtained from system. |
go_memstats_next_gc_bytes | Gauge | Number of heap bytes when next garbage collection will take place. |
go_memstats_other_sys_bytes | Gauge | Number of bytes used for other system allocations. |
go_memstats_stack_inuse_bytes | Gauge | Number of bytes in use by the stack allocator. |
go_memstats_stack_sys_bytes | Gauge | Number of bytes obtained from system for stack allocator. |
go_memstats_sys_bytes | Gauge | Number of bytes obtained from system. |
go_threads | Gauge | Number of OS threads created. |