Alerts

How to set up alerts for Hydrolix infrastructure

Alerting in Grafana

Hydrolix relies on Grafana and Prometheus for infrastructure observability. They are already configured to talk to each other as part of your deployment, and we also provide a default alert dashboard that you can customise to your specific needs (e.g., by changing threshold values).

To access it, log in to Grafana and open the Platform Alerts dashboard in the Hydrolix Monitoring folder.

You will see a default dashboard with pre-built charts, queries and alert condition thresholds. You will need to modify the alert conditions to match your deployment. You can also add new charts/queries and alert conditions as needed by your team.

We split up the dashboard into different alerting component groups:

  • Batch Ingest
  • Kafka Ingest
  • Streaming Ingest
  • Merge
  • Query
  • Shared Components - such as ZooKeeper, Web UI, Prometheus, etc.
  • Visualisation - Superset and Grafana
  • System Monitoring - overall CPU usage, disk, etc.

The full list of components is here

Modify Threshold

All our charts have a default threshold that you may need to modify for your deployment. For example, if you have 10 query peers running and want to be alerted when one becomes unavailable, modify the Query Server Count chart.
Click Edit on the chart.

This will open a view similar to this one:

Here we are monitoring the number of query heads and query peers running. Each query is written in PromQL syntax. In the example above, we aggregate over a named service and sum the result to get the total running. Note also that each query gets a default name (A, B, and so on). You can rename the queries to make them clearer if needed.
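To try a query like this outside Grafana, you can send it to the Prometheus HTTP API. The sketch below is illustrative only: the `up` metric exists in Prometheus, but the `service` label, the service name, and the Prometheus address are assumptions and will differ in your deployment. Copy the real expression from the chart's query editor.

```python
import json
from urllib.parse import urlencode


def build_count_query(service: str) -> str:
    """Return a PromQL expression that sums the `up` series for one service.

    The `service` label is an assumption; use the label your chart's query uses.
    """
    return f'sum(up{{service="{service}"}})'


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API (/api/v1/query)."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_instant_value(body: str) -> float:
    """Extract the single value from an instant-query vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed")
    result = payload["data"]["result"]
    if not result:
        # No matching series: treat as zero instances running.
        return 0.0
    # Each vector sample is [timestamp, "value-as-string"].
    return float(result[0]["value"][1])


# Hypothetical address and service name, for illustration only.
url = build_query_url("http://prometheus:9090", build_count_query("query-peer"))
```

Fetching `url` with any HTTP client and passing the response body to `parse_instant_value` gives the same number that the chart plots for that query.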

To customize the alert (e.g., to change a threshold value), click on Alert.

This is where we describe the alert. The Rule has a name and defines how often the condition is checked, every minute in this example. Conditions defines the state we want to evaluate. If the condition evaluates to true, an alert is fired. In the above example, the alert fires if either query A or query B drops below 1 in a 5m window.
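The logic of such a condition can be sketched in a few lines of Python. This is a simplified model, not Grafana's implementation: it assumes each query yields one sample per minute over the 5m window and that the condition reduces the window with `min`. The reducer is an assumption; check which one your alert actually uses.

```python
from typing import Callable, Dict, Sequence


def alert_should_fire(
    samples_by_query: Dict[str, Sequence[float]],
    threshold: float,
    reducer: Callable[[Sequence[float]], float] = min,
) -> bool:
    """Return True if ANY query's reduced window value is below the threshold.

    This mirrors "A OR B is below threshold"; empty windows are skipped.
    """
    return any(
        reducer(samples) < threshold
        for samples in samples_by_query.values()
        if samples
    )


# Default rule from the text: fire if A or B drops below 1 in the window.
window = {
    "A": [1, 1, 1, 1, 1],  # query heads, one sample per minute
    "B": [3, 3, 0, 3, 3],  # query peers dipped to 0 once
}
fires_default = alert_should_fire(window, threshold=1)

# Threshold raised for the 10-peer example: fire if fewer than 10 peers run.
fires_ten_peers = alert_should_fire({"B": [10, 10, 9, 10, 10]}, threshold=10)
```

With the 10-peer deployment described earlier, raising the threshold from 1 to 10 is what makes the alert fire as soon as a single peer is missing.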

You can customise each chart like this based on the scale of your deployment and infrastructure. Once the thresholds are right, you can define a notification channel.

Configure Alert Notification

To configure a notification channel, select Alerting in the left navigation bar, then click Notification channels.

Click on New Channel and choose from the list of notification types. Each type has a different set of configuration settings you will need to define.

Once a notification channel has been created, it becomes available in the alert's Send to field: click the + symbol and select the notification channel you want.
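Notification channels can also be created through Grafana's HTTP API instead of the UI. The sketch below assumes your Grafana version still uses legacy alerting with notification channels (the `/api/alert-notifications` endpoint), as the UI described here suggests. The Grafana address, API key, channel name, and email address are placeholders. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_channel_request(
    grafana_url: str, api_key: str, name: str, addresses: str
) -> urllib.request.Request:
    """Build a POST request that creates an email notification channel."""
    payload = {
        "name": name,
        "type": "email",
        "isDefault": False,
        "sendReminder": False,
        # Settings differ per channel type; email takes a list of addresses.
        "settings": {"addresses": addresses},
    }
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/alert-notifications",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_channel_request(
    "http://grafana:3000", "YOUR_API_KEY", "ops-email", "ops@example.com"
)
```

Passing `request` to `urllib.request.urlopen` sends it. The new channel then appears under Send to in the same way as one created in the UI.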

