Metrics for Observability

Use metrics to enhance observability and build dashboards

Overview

Observability means different things depending on the purpose. This guide lists key metrics to track when building dashboards and offers suggestions for avoiding alert fatigue. It assumes a Kubernetes deployment with Prometheus for metrics and Grafana for visualization.

Metric types

Different areas of a Hydrolix deployment expose different types of metrics.

Node-level metrics

CPU usage
Memory usage
Disk usage
Network traffic I/O
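
As a rough illustration of the node-level metrics listed above, the following Python sketch collects one PromQL expression per metric and builds a URL for the Prometheus instant-query HTTP endpoint. It assumes node_exporter is being scraped, since that is where these metric names come from. The dictionary keys, the function name, and the `prometheus:9090` address are hypothetical, and the label filters may need adjusting for a given cluster.

```python
from urllib.parse import urlencode

# Assumption: node_exporter is scraped by Prometheus, so its standard
# metric names are available. The dictionary keys are illustrative only.
NODE_QUERIES = {
    "cpu_usage_percent":
        '100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))',
    "memory_usage_percent":
        "100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)",
    "disk_usage_percent":
        "100 * (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes)",
    "network_receive_bytes_per_sec":
        "rate(node_network_receive_bytes_total[5m])",
    "network_transmit_bytes_per_sec":
        "rate(node_network_transmit_bytes_total[5m])",
}


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Hypothetical in-cluster Prometheus address; replace with your own.
    for name, promql in NODE_QUERIES.items():
        print(name, instant_query_url("http://prometheus:9090", promql))
```

The same expressions can be pasted directly into a Grafana panel that uses a Prometheus data source.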

Pod and container metrics

CPU and memory usage per pod or container
Pod status and restarts
Resource requests and limits
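
The pod and container metrics above could be queried along these lines. This is a minimal sketch, assuming cAdvisor metrics (exposed by the kubelet) and kube-state-metrics are both scraped by Prometheus. The function name, the dictionary keys, and the `hydrolix` namespace are hypothetical.

```python
from typing import Dict


def pod_queries(namespace: str) -> Dict[str, str]:
    """Return PromQL expressions for per-pod metrics in one namespace.

    Assumption: container_* metrics come from cAdvisor and kube_* metrics
    come from kube-state-metrics. The keys are illustrative only.
    """
    sel = f'namespace="{namespace}"'
    return {
        "cpu_cores_per_pod":
            f"sum by (pod) (rate(container_cpu_usage_seconds_total{{{sel}}}[5m]))",
        "memory_bytes_per_pod":
            f"sum by (pod) (container_memory_working_set_bytes{{{sel}}})",
        "restarts_last_hour":
            f"sum by (pod) (increase(kube_pod_container_status_restarts_total{{{sel}}}[1h]))",
        "pods_by_phase":
            f"sum by (phase) (kube_pod_status_phase{{{sel}}})",
        "cpu_requests_per_pod":
            f'sum by (pod) (kube_pod_container_resource_requests{{{sel},resource="cpu"}})',
        "cpu_limits_per_pod":
            f'sum by (pod) (kube_pod_container_resource_limits{{{sel},resource="cpu"}})',
    }


if __name__ == "__main__":
    # "hydrolix" is a placeholder namespace, not a confirmed default.
    for name, promql in pod_queries("hydrolix").items():
        print(f"{name}: {promql}")
```

Comparing the usage expressions against the request and limit expressions on the same panel makes it easier to spot pods that run close to their limits.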

Control plane metrics

Scheduler performance
Server latency
Hydrolix-specific metrics

Grafana dashboards

Hydrolix uses Prometheus for metrics and Kubernetes for logs, with visualization through Grafana dashboards. Dashboards can be filtered to reduce over-alerting while still surfacing issues as they happen.
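
One common way to reduce over-alerting is to fire only when a condition persists, which is the idea behind the `for` clause in Prometheus alerting rules. The Python sketch below illustrates that idea on a list of samples. It is not how Hydrolix or Grafana implement alerting, and the function name and the threshold values are hypothetical.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[float], threshold: float,
                     min_consecutive: int) -> bool:
    """Return True only if the most recent `min_consecutive` samples
    all exceed `threshold`. Short spikes are ignored."""
    if min_consecutive <= 0 or len(samples) < min_consecutive:
        return False
    return all(value > threshold for value in samples[-min_consecutive:])


if __name__ == "__main__":
    # A single spike to 95% CPU does not trigger an alert.
    print(sustained_breach([50, 95, 60, 70], threshold=90, min_consecutive=3))
    # Three consecutive samples above 90% do.
    print(sustained_breach([50, 91, 95, 99], threshold=90, min_consecutive=3))
```

Requiring several consecutive breaches trades a little detection latency for fewer alerts on transient spikes.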

See how to Visualize Hydrolix Data in Grafana.

For a list of metrics used by Hydrolix, see All Metrics.

What’s Next

Visualize Hydrolix Data in Grafana
