Stream Metrics

The Hydrolix stack includes Prometheus, an open-source metrics database. Hydrolix outputs metrics to the included Prometheus instance.

UI

Prometheus provides a web-based UI, available at https://<hostname>/prometheus, that displays query results as graphs.

For more information about metric types, refer to the Prometheus documentation.

Metrics

If more than one component uses a given metric, querying it returns results from all relevant components. You can restrict results to a specific component by adding a service label to your query, for example process_open_fds{service="stream-peer"}.
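As a minimal sketch of the label filter described above, the following Python builds a selector such as process_open_fds{service="stream-peer"} and wraps it in a URL for the Prometheus HTTP API instant-query endpoint (/api/v1/query). It assumes the API is served under the same /prometheus prefix as the UI, which this page does not confirm. The helper names build_selector and build_query_url and the host example.invalid are illustrative only.

```python
from urllib.parse import urlencode


def _escape(value: str) -> str:
    """Escape backslashes and double quotes inside a PromQL label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_selector(metric: str, **labels: str) -> str:
    """Return a PromQL selector, e.g. process_open_fds{service="stream-peer"}."""
    if not labels:
        return metric
    matchers = ",".join(
        '{}="{}"'.format(name, _escape(value))
        for name, value in sorted(labels.items())
    )
    return metric + "{" + matchers + "}"


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL. Assumes the API lives under base_url."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


if __name__ == "__main__":
    selector = build_selector("process_open_fds", service="stream-peer")
    # example.invalid is a placeholder host, not a real deployment.
    print(build_query_url("https://example.invalid/prometheus", selector))
```

Dropping the service argument returns the bare metric name, which matches the unfiltered behaviour where every component reporting the metric appears in the results.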

Each ingest method's peers run multiple containers. One container performs message acquisition. Another container, known as the indexer, completes indexing and enrichment jobs.

HTTP Stream Ingest

These metrics are specific to the use of streaming data sources.

Traefik

The Traefik HTTP routing service produces the following metrics:

Metric | Type | Components | Purpose
traefik_service_requests_total | Counter | Traefik | Traefik HTTP request information.
traefik_service_request_duration_seconds_count/sum/bucket | Histogram | Traefik | Response time from Traefik to the client.
http_source_request_duration_ns_count/sum/bucket | Histogram | Traefik | Response time from Stream-Head.

Intake-Head

The Intake-Heads are an all-in-one replacement for the older Stream-Head/Stream-Peer architecture.

Metric | Type | Components | Purpose
hdx_sink_backlog_bytes_count | Gauge | Intake head | Total bytes of all partition buckets in sink backlog waiting to be indexed. Only produced when intake_head_index_backlog_enabled is true.
hdx_sink_backlog_items_count | Gauge | Intake head | Total count of partition buckets in sink backlog waiting to be indexed. Only produced when intake_head_index_backlog_enabled is true.
hdx_sink_backlog_dropped_bytes_count | Counter | Intake head | Total bytes of partition buckets dropped due to backlog growing too big. Only produced when intake_head_index_backlog_enabled is true.
hdx_sink_backlog_dropped_items_count | Counter | Intake head | Count of partition buckets dropped due to backlog growing too big. Only produced when intake_head_index_backlog_enabled is true.
hdx_sink_backlog_delivery_count | Counter | Intake head | Count of backlog buckets successfully handed off to indexing. Only produced when intake_head_index_backlog_enabled is true.
hdx_sink_backlog_trim_duration_ns | Histogram | Intake head | Time to trim the backlog in nanoseconds. Only produced when intake_head_index_backlog_enabled is true.
http_source_outstanding_reqs | Gauge | Intake head | Number of outstanding ingest event requests.

Stream-Head

Stream-Heads, which coordinate streaming jobs, produce the following metrics:

Metric | Type | Components | Purpose
http_source_byte_count | Counter | Stream head | Count of bytes processed.
http_source_request_count | Counter | Stream head | Count of HTTP requests.
http_source_request_duration_ns_count/bucket/sum | Histogram | Stream head | A histogram of HTTP request process duration in nanoseconds. Time is measured between last byte of the message received and placement of the message (or last part of the message) onto the queue.
http_source_request_error_count | Counter | Stream head | Count of HTTP request failures.
http_source_row_count | Counter | Stream head | Count of rows processed.
http_source_value_count | Counter | Stream head | Count of values processed.
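Histogram metrics such as http_source_request_duration_ns expose _sum and _count series, so a mean request duration over a window can be derived from the change in each. The sketch below shows that arithmetic in Python and converts nanoseconds to milliseconds. The function name and sample numbers are illustrative, and counter resets within the window are not handled. In PromQL the same idea is commonly written as rate(http_source_request_duration_ns_sum[5m]) / rate(http_source_request_duration_ns_count[5m]).

```python
from typing import Optional


def mean_duration_ms(sum_start_ns: float, sum_end_ns: float,
                     count_start: float, count_end: float) -> Optional[float]:
    """Mean request duration in milliseconds between two scrapes.

    Inputs are the _sum (nanoseconds) and _count values read at the start
    and end of the window. Returns None when no requests were observed.
    """
    requests = count_end - count_start
    if requests <= 0:
        return None
    mean_ns = (sum_end_ns - sum_start_ns) / requests
    return mean_ns / 1_000_000  # nanoseconds to milliseconds


if __name__ == "__main__":
    # Hypothetical readings: 2000 requests took 6 seconds in total.
    print(mean_duration_ms(4_000_000_000, 10_000_000_000, 1000, 3000))
```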

RedPanda

Internal RedPanda queues produce the following metrics:

Metric | Type | Components | Purpose
internal_event_queue_byte_count{mode="sink"} | Counter | Stream Head | Byte count sent to RedPanda.
internal_event_queue_row_count{mode="sink"} | Counter | Stream Head | Row count sent to RedPanda.
internal_event_queue_value_count{mode="sink"} | Counter | Stream Head | Value count sent to RedPanda.
internal_event_queue_row_count{mode="source"} | Counter | Stream Peer | Row count received from RedPanda.
internal_event_queue_value_count{mode="source"} | Counter | Stream Peer | Value count received from RedPanda.

Stream-Peer

Stream-Peers, which carry out streaming jobs, produce the following metrics:

Metric | Type | Components | Purpose
query_count | Counter | Stream Peer | Count of calls to the Catalog.
query_failure | Counter | Stream Peer | Count of failed Catalog calls.
query_latency_summary | Counter | Stream Peer | Latency in calls to catalog.
query_latency_summary_count/sum | Count/Sum | Stream Peer | Latency in calls to catalog.
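Because query_count and query_failure are counters, a Catalog failure ratio is best computed from their increase over a window rather than from raw values. The sketch below is one way to do that in Python. It treats a drop in a counter as a process restart, a simplification of how Prometheus handles counter resets. The function names and sample values are hypothetical.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase of a counter between two scrapes.

    If the value went down, assume the process restarted and the counter
    began again from zero, so the current value is the increase.
    """
    return current if current < previous else current - previous


def catalog_failure_ratio(calls_prev: float, calls_now: float,
                          failures_prev: float, failures_now: float) -> float:
    """Fraction of Catalog calls that failed during the window."""
    calls = counter_increase(calls_prev, calls_now)
    if calls == 0:
        return 0.0
    return counter_increase(failures_prev, failures_now) / calls


if __name__ == "__main__":
    # Hypothetical readings of query_count and query_failure.
    print(catalog_failure_ratio(1000, 1200, 10, 15))
```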

Additionally, Stream ingest produces Stream-Peer Indexer metrics.

Kafka Ingest

Kafka data sources produce the following metrics:

Metric | Type | Components | Purpose
kafka_source_byte_count | Counter | Kafka peer | Count of bytes read from Kafka.
kafka_source_commit_duration_ns_count/bucket/sum | Histogram | Kafka peer | Kafka commit duration.
kafka_source_read_count | Counter | Kafka peer | Count of Kafka reads.
kafka_source_read_duration_ns_count/bucket/sum | Histogram | Kafka peer | Kafka read duration.
kafka_source_read_error_count | Counter | Kafka peer | Count of Kafka errors.
kafka_source_row_count | Counter | Kafka peer | Count of rows processed.
kafka_source_value_count | Counter | Kafka peer | Count of values processed.
query_count | Counter | Kafka Peer | Count of calls to the Catalog.
query_failure | Counter | Kafka Peer | Count of failed Catalog calls.
query_latency_summary | Counter | Kafka Peer | Latency in calls to catalog.
query_latency_summary_count/sum | Count/Sum | Kafka Peer | Latency in calls to catalog.

Additionally, Kafka ingest produces Kafka-Peer Indexer metrics.

Kinesis Ingest

Kinesis data sources produce the following metrics:

Metric | Type | Components | Purpose
kinesis_source_byte_count | Counter | Kinesis peer | Count of bytes read from Kinesis.
kinesis_source_checkpoint_count | Counter | Kinesis peer | Count of Kinesis checkpoint operations.
kinesis_source_checkpoint_duration_ns_count/bucket/sum | Histogram | Kinesis peer | Duration of Kinesis checkpoint operations.
kinesis_source_lag_ms | Gauge | Kinesis peer | Measure of lag in Kinesis source.
kinesis_source_operation_count | Counter | Kinesis peer | Count of operations on Kinesis.
kinesis_source_operation_duration_ns_count/bucket/sum | Histogram | Kinesis peer | Histogram of duration of operations on Kinesis.
kinesis_source_record_count | Counter | Kinesis peer | Count of records read from Kinesis.
kinesis_source_row_count | Counter | Kinesis peer | Count of rows read from Kinesis.
kinesis_source_value_count | Counter | Kinesis peer | Count of values read from Kinesis.
query_count | Counter | Kinesis Peer | Count of calls to the Catalog.
query_failure | Counter | Kinesis Peer | Count of failed Catalog calls.
query_latency_summary | Counter | Kinesis Peer | Latency in calls to catalog.
query_latency_summary_count/sum | Count/Sum | Kinesis Peer | Latency in calls to catalog.

Additionally, Kinesis ingest produces Kinesis-Peer Indexer metrics.

Indexer Metrics

Indexer metrics cover the indexing and enrichment of data being ingested. All of the above peer components produce these metrics. The indexer produces the following metrics:

Metric | Type | Components | Purpose
indexer_rows_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Total rows indexed (written to partitions).
indexer_bytes_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Total bytes indexed (written to partitions).
indexer_partitions_rejected_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Histogram of partitions that could not be written. The value identifies the rejection reason: 0 = query_parse_error, 1 = network_error, 2 = internal_system_error, 3 = schema_mismatch, 4 = partition_files_write_failed, 5 = block_conversion_or_insertion_failed, 6 = other_internal_error.
indexer_partitions_written_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Total partitions created.
indexer_partition_write_seconds_count/bucket/sum | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Count of rows processed by the indexer and uploaded to storage. Includes Hot and Cold reporting.
hdx_indexer_req_duration_ns | Histogram | Intake Head | A histogram of durations of requests to the indexer.
hdx_indexer_req_errors | Counter | Intake Head | Count of errors for requests to the indexer service.
hdx_sink_bucket_maint_duration_ns | Summary | Intake Head | Summary of the bucket maintenance loop execution time.
hdx_sink_bucket_seal_files | Histogram | Intake Head | A histogram of the number of files in buckets when sealed.
hdx_sink_byte_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Count of bytes processed by the indexer and uploaded to storage. Includes Hot and Cold reporting.
hdx_sink_row_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Count of rows processed by the indexer and uploaded to storage. Includes Hot and Cold reporting.
hdx_sink_value_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Count of values processed by the indexer and uploaded to storage. Includes Hot and Cold reporting.
hdx_sink_error_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP | Count of errors in indexing and uploading to storage.
hdx_sink_open_bucket_slots | Counter | Intake Head | Measure of open bucket slots in the sink.
hdx_sink_partition_rows_summary | Summary | Intake Head | Summary of number of rows in created partitions.
hdx_upload_obj_store_duration_ns | Summary | Intake Head | Summary of durations for uploading partition index files to object storage, in nanoseconds.
hdx_upload_obj_store_errors | Counter | Intake Head | Count of errors uploading partition index files to storage.
hdx_upload_process_write_result_duration_ns | Summary | Intake Head | Summary of durations for processing of results, in nanoseconds.
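The numeric values recorded by indexer_partitions_rejected are easier to read in dashboards or alerts when mapped back to their reason names. The Python sketch below encodes the mapping listed for that metric. The lookup function and its handling of unknown codes are illustrative additions, not part of Hydrolix.

```python
# Rejection reasons as listed for indexer_partitions_rejected.
REJECTION_REASONS = {
    0: "query_parse_error",
    1: "network_error",
    2: "internal_system_error",
    3: "schema_mismatch",
    4: "partition_files_write_failed",
    5: "block_conversion_or_insertion_failed",
    6: "other_internal_error",
}


def rejection_reason(code: int) -> str:
    """Return the reason name for a rejection code, or a marker if unknown."""
    return REJECTION_REASONS.get(code, "unknown({})".format(code))


if __name__ == "__main__":
    for code in (0, 3, 9):
        print(code, rejection_reason(code))
```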

Storage

The following metrics cover interactions with cloud object storage.

Each of the below object_store* metrics has these labels:

  • provider - Object storage provider (AWS, Azure, GCS)
  • code - HTTP Response code
  • method - HTTP Method used (POST, GET, etc)
  • host - HTTP Host used to target object storage.

Metric | Type | Components | Purpose
net_http_status_code_bucket | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | HTTP Status Code histogram count from Storage.
object_store_http_histo | Histogram | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A histogram of object storage interaction latencies.
object_store_http_summary | Summary | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A summary of object storage interaction latencies.
object_store_http_status_code_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of successful HTTP requests against object storage (replaces net_http_status_code_count). Requests resulting in 500 are still considered successful.
object_store_http_error_count | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of HTTP request errors (timeouts, connection errors, etc.).
object_store_http_bytes_tx | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of bytes transmitted to object storage (request body only).
object_store_http_bytes_rx | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Intake Head | A count of bytes received from object storage (response body only).
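The code label makes it possible to group object storage responses by status class, which helps because object_store_http_status_code_count counts 500 responses alongside genuinely successful ones. The Python sketch below sums samples per class. It assumes input shaped like the result entries of a Prometheus HTTP API instant query (a "metric" label map plus a [timestamp, "value"] pair) and assumes code values are three-digit strings such as "200". The sample data is invented for illustration.

```python
from collections import defaultdict


def totals_by_status_class(samples):
    """Sum sample values per HTTP status class ("2xx", "5xx", ...).

    Samples without a three-digit numeric code label are grouped as "other".
    """
    totals = defaultdict(float)
    for sample in samples:
        code = sample["metric"].get("code", "")
        if len(code) == 3 and code.isdigit():
            status_class = code[0] + "xx"
        else:
            status_class = "other"
        # The API returns each value as a [timestamp, "string"] pair.
        totals[status_class] += float(sample["value"][1])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical samples, not real output.
    example = [
        {"metric": {"code": "200", "method": "GET"}, "value": [0, "120"]},
        {"metric": {"code": "500", "method": "PUT"}, "value": [0, "5"]},
    ]
    print(totals_by_status_class(example))
```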

Additional Metrics

The following metrics track the management and control of different components.

General Metrics

The following metrics track various counters and statistics during data ingestion:

Metric | Type | Components | Purpose
up | Counter | All components | How many pods are up for each component.
process_cpu_seconds_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total user and system CPU time spent in seconds.
process_max_fds | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Maximum number of open file descriptors.
process_open_fds | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of open file descriptors.
process_resident_memory_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Resident memory size in bytes.
process_start_time_seconds | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Virtual memory size in bytes.
process_virtual_memory_max_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Current number of scrapes being served.
promhttp_metric_handler_requests_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total number of scrapes by HTTP status code.
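One practical use of the process metrics is comparing process_open_fds with process_max_fds to see how close a component is to its file descriptor limit. The Python sketch below computes that ratio. The 0.8 threshold is an arbitrary illustrative value, not a Hydrolix recommendation, and the function names are hypothetical.

```python
def fd_utilization(open_fds: float, max_fds: float) -> float:
    """Fraction of the file descriptor limit currently in use."""
    if max_fds <= 0:
        raise ValueError("max_fds must be positive")
    return open_fds / max_fds


def is_fd_pressure(open_fds: float, max_fds: float,
                   threshold: float = 0.8) -> bool:
    """True when utilization reaches the (assumed) alerting threshold."""
    return fd_utilization(open_fds, max_fds) >= threshold


if __name__ == "__main__":
    # Hypothetical readings of process_open_fds and process_max_fds.
    print(fd_utilization(256, 1024), is_fd_pressure(900, 1024))
```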

Go Environment Metrics

The following metrics track resources used by Hydrolix's Go environment:

Metric | Type | Components | Purpose
go_gc_duration_seconds | Summary | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | A summary of the pause duration of garbage collection cycles.
go_goroutines | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of goroutines that currently exist.
go_info | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Information about the Go environment.
go_memstats_alloc_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total number of frees.
go_memstats_gc_cpu_fraction | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | The fraction of this program's available CPU time used by the GC since the program started.
go_memstats_gc_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes that are in use.
go_memstats_heap_objects | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of allocated objects.
go_memstats_heap_released_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes released to OS.
go_memstats_heap_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total number of pointer lookups.
go_memstats_mallocs_total | Counter | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Total number of mallocs.
go_memstats_mcache_inuse_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of bytes obtained from system.
go_threads | Gauge | Batch (inc. Autoingest), Kafka, Kinesis, Stream HTTP, Traefik | Number of OS threads created.