Intake Pool Monitoring

Heartbeat monitoring for multiple ingest pools was introduced in Hydrolix version 5.7.4.

Monitor data ingestion health across multiple intake pools using the monitor-ingest service and hydro.monitor table.

The monitor-ingest service provides continuous health monitoring for intake pools by sending heartbeat events at one-second intervals. This service monitors multiple intake pools simultaneously, providing visibility into ingestion pipeline health across the entire cluster.

When to use intake pool monitoring⚓︎

As clusters scale horizontally with additional intake pools to distribute load and improve performance, monitoring only the primary pool creates information gaps. Data drops or ingestion failures in secondary pools can go undetected without comprehensive intake pool monitoring.

Intake pool monitoring addresses this by:

  • Maintaining visibility as intake pools are added for scaling
  • Detecting data drops across all ingestion endpoints
  • Identifying pool-specific issues before they impact operations
  • Supporting selective monitoring through pool exemptions

Heartbeat mechanism⚓︎

The monitor-ingest service uses this workflow:

  1. Sends an HTTP POST request to /pool/{pool_name}/ingest/event once per second for each monitored pool and verifies the response
  2. Auto-discovers intake pools based on service type
  3. Records heartbeat data in the hydro.monitor table
  4. Excludes exempted pools configured in monitor_ingest_pool_exemptions
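The heartbeat loop above can be sketched in Python. This is an illustrative sketch, not the actual monitor-ingest implementation: the base URL, payload shape, and `heartbeat_url` helper are assumptions for the example.

```python
import json
import time
import urllib.request

def heartbeat_url(base_url: str, pool_name: str) -> str:
    """Build the per-pool heartbeat endpoint path used by monitor-ingest."""
    return f"{base_url}/pool/{pool_name}/ingest/event"

def send_heartbeats(base_url: str, pools: list[str], iterations: int) -> None:
    """Send one heartbeat POST per monitored pool, once per second."""
    for _ in range(iterations):
        for pool in pools:
            # Payload shape is illustrative; the request timestamp enables
            # latency calculations against the recorded heartbeat time.
            body = json.dumps({"monitor_request_timestamp": time.time()}).encode()
            req = urllib.request.Request(
                heartbeat_url(base_url, pool),
                data=body,
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req)  # a 2xx response confirms the pool accepted the event
        time.sleep(1)
```

A pool that fails to return a successful response, or that returns it slowly, shows up as a gap or delay in the recorded heartbeat data.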

Pool discovery logic⚓︎

The service automatically monitors pools where the service type is intake-head.

Pools with other service types or pools listed in monitor_ingest_pool_exemptions are excluded from monitoring.

Pool selection example⚓︎

pools = []
for pool in cfg.all_pools_list:
    pool_name = pool["name"]
    if (pool_name not in cfg.monitor_ingest_pool_exemptions
            and pool["service"] == "intake-head"
            and pool_name not in pools):
        pools.append(pool_name)

The hydro.monitor table⚓︎

Heartbeat data is stored in the hydro.monitor system table with this schema:

| Field | Type | Description | Indexed? |
| --- | --- | --- | --- |
| timestamp | datetime | The time at which the monitor-ingest service recorded the heartbeat data into the hydro.monitor table. This is the primary timestamp field. | Primary |
| pod_ip | string | Identifies which monitor-ingest pod sent the heartbeat (useful in multi-replica deployments) | Not indexed |
| monitor_request_timestamp | datetime | Records when the HTTP request was initiated, enabling latency calculations | Indexed |
| intake_pool | string | Identifies which intake pool received the heartbeat request. This field was added in Hydrolix version 5.7.4 to support monitoring multiple intake pools. | Not indexed |
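Because each row carries both monitor_request_timestamp and timestamp, ingest latency for a heartbeat is simply their difference. A minimal sketch (the field values below are made-up example data):

```python
from datetime import datetime, timedelta

def ingest_latency(request_ts: datetime, recorded_ts: datetime) -> timedelta:
    """Latency between when the heartbeat POST was initiated
    (monitor_request_timestamp) and when it was recorded (timestamp)."""
    return recorded_ts - request_ts

# Example: heartbeat sent at 12:00:00.000, recorded at 12:00:00.250
sent = datetime(2024, 1, 1, 12, 0, 0)
recorded = datetime(2024, 1, 1, 12, 0, 0, 250000)
print(ingest_latency(sent, recorded).total_seconds())  # 0.25
```

A sustained rise in this difference for one intake_pool value, while other pools stay flat, points at a pool-specific ingestion problem.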

Operational considerations⚓︎

Before deploying intake pool monitoring, review these operational considerations and technical limitations.

Monitoring overhead⚓︎

  • Heartbeat frequency: 1 heartbeat per second per monitored pool
  • Data volume: ~86,400 events per pool per day
  • Impact: Monitoring 10 pools generates ~864,000 events daily
  • Recommendation: Use monitor_ingest_pool_exemptions to limit monitoring to critical pools
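The volume figures above follow directly from the one-heartbeat-per-second rate:

```python
HEARTBEATS_PER_SECOND = 1
SECONDS_PER_DAY = 86_400

def daily_heartbeat_events(monitored_pools: int) -> int:
    """Events written to hydro.monitor per day for a given pool count."""
    return monitored_pools * HEARTBEATS_PER_SECOND * SECONDS_PER_DAY

print(daily_heartbeat_events(1))   # 86400
print(daily_heartbeat_events(10))  # 864000
```

Exempting pools that don't need monitoring scales this volume down linearly.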

Pool autodiscovery⚓︎

  • Only monitors intake-head service type: Pools must have service type intake-head
  • Other pool types: Silently excluded from automatic monitoring
  • Manual pool selection: Use monitor_ingest_pool_exemptions to exclude specific pools; inverse selection not supported

Data retention⚓︎

  • Historical monitor data follows standard table retention policies
  • Queries over long time periods may be slow without appropriate time filters
  • Set custom retention for hydro.monitor if disk space is limited

Pre-v5.7.4 data⚓︎

  • The intake_pool column doesn't exist in data written before Hydrolix version 5.7.4
  • Clusters upgraded from earlier versions will have NULL values in this column for historical rows
  • This is expected and doesn't impact current monitoring

Next steps⚓︎

To query heartbeat data, configure monitoring, and set up alerts, see Use intake pool monitoring.