You can combine Hydrolix's built-in Prometheus component with Grafana to monitor your Hydrolix cluster. Your cluster comes with a default alert dashboard that you can customize to meet your specific needs.
First, connect Grafana to your cluster's Prometheus metrics.
Navigate to the Hydrolix Monitoring folder and open the Platform Alerts dashboard.
This pre-built dashboard contains charts, queries, and alert condition thresholds. The Platform Alerts dashboard organizes its alerts into the following component groups:
- Batch Ingest
- Kafka Ingest
- Streaming Ingest
- Shared Components (Zookeeper, the Portal UI, Prometheus, etc.)
- Visualization (Superset and Grafana)
- System Monitoring (CPU usage, disk, etc.)
Modify the alert conditions to meet your cluster's needs. You can also add new charts, queries, and alert conditions as your team needs them.
By default, every chart sets an alert threshold. When a metric crosses the threshold, Grafana alerts you to a potential problem. For example, a Hydrolix cluster might run 10 query peers. You could configure a threshold to alert you if any one of those 10 query peers becomes unavailable.
To change an alert threshold:
- Click the dropdown arrow on the chart you would like to modify.
- Select Edit from the dropdown menu.
- Click the Alert tab to edit the alert threshold.
The example chart monitors the number of running query heads and peers. You write Grafana queries against Prometheus in PromQL. The query in this example aggregates over a named service to calculate a sum.
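As a minimal sketch of such an aggregation, the following PromQL sums Prometheus's built-in `up` metric over a single service. The `service` label and the value `query-peer` are assumptions for illustration, not names taken from the dashboard. Check the chart's query editor for the metric and label names your cluster actually exposes.

```promql
# Count the scrape targets of one service that are currently up.
# up is 1 for a target Prometheus scraped successfully, 0 otherwise.
# The label "service" and the value "query-peer" are hypothetical.
sum(up{service="query-peer"})
```

With 10 healthy query peers, a query of this shape would return 10. A lower value means at least one target failed its most recent scrape.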
The Alert tab defines the alert thresholds. The Rule sets how often Grafana checks the conditions: every minute in this example. Conditions define the state to evaluate. If a condition evaluates to true, Grafana sends an alert. In this example, the condition checks whether two queries (A and B) dropped below 1 in the last 5 minutes.
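The condition described above is configured in the Grafana UI, but its logic can be approximated in PromQL. This sketch assumes that query A counts query heads and query B counts query peers, that the condition takes the minimum over the window, and that either query dropping below 1 should fire the alert. The `service` label and its values are hypothetical.

```promql
# Returns a result (alert) if either count fell below 1
# at any point in the last 5 minutes, sampled every minute.
# Label "service" and both of its values are hypothetical.
min_over_time(sum(up{service="query-head"})[5m:1m]) < 1
or
min_over_time(sum(up{service="query-peer"})[5m:1m]) < 1
```

An expression like this returns nothing while both counts stay at 1 or above. It also returns nothing if the `up` series disappear entirely, so a production rule may need a separate check for missing data.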
To choose notification channels for an alert, edit the alert's Send to option.
To create a notification channel:
- Click Alerting in the left navigation bar.
- Select the Notification channels option from the dropdown.
- Click New Channel and choose from the list of notification types. Each type has a different set of configuration settings to define.