Hydrologs

Use the hydro.logs table for insight into your cluster's health

Overview

Hydrolix uses Vector to gather log information from the Kubernetes namespaces of its clusters and stores it in the hydro.logs table. This data reflects cluster and pod health, and it is a good starting point when troubleshooting or looking for insight into cluster activity.

Considerations when querying hydro.logs

This table is intentionally large and wide. It includes hundreds of fields to support deep observability and debugging. It's important to use targeted filters and time bounds to keep your queries fast and efficient.
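The following sketch illustrates this approach, assuming a primary timestamp column named timestamp and the app, level, and message fields described below; the 15-minute window and the LIMIT are illustrative values, not recommendations:

-- Select only the columns you need, bound the time range, and filter on a field.
SELECT timestamp, app, level, message
FROM hydro.logs
WHERE timestamp > NOW() - INTERVAL 15 MINUTE
  AND level = 'error'
LIMIT 100;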

Common useful fields

Use these table fields to query information from the hydro.logs table. An example query follows the table.

app: The service related to the log entry. Pulled from the hydrolix.io/service Kubernetes annotation if available; otherwise falls back to metadata.labels.app. Examples: grafana, operator, query-peer
component: The internal Hydrolix component responsible for the log entry. Examples: query_executor, root, turbine
container: The container associated with the log entry. Examples: node-exporter, query-head, merge-peer
error: The error message related to the log entry, if present. Examples: Unable to merge, segfault, timeout
message: A description or summary of the log event. Examples: Processed event, Deleted expired cache, Error connecting
level: The severity level of the log entry. Examples: error, info
pool: The pool associated with the log entry. Pulled from the hydrolix.io/pool Kubernetes annotation. Examples: query-peer, siem-connection
stream: The output stream for the log entry, for example standard output or standard error. Examples: stdout, stderr
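Here is the example query referenced above. It is a sketch that combines several of these fields to find recent error-level entries for a single service; the one-hour window, the 'query-peer' value, and the LIMIT are illustrative:

-- Recent error-level log entries for one service, newest first.
SELECT timestamp, app, component, message, error
FROM hydro.logs
WHERE timestamp > NOW() - INTERVAL 1 HOUR
  AND app = 'query-peer'
  AND level = 'error'
ORDER BY timestamp DESC
LIMIT 50;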

Kubernetes table fields

Use these table fields for Kubernetes and cluster-specific queries against the hydro.logs table. An example query follows the table.

kubernetes.container_id: Unique ID of the container. Example: docker://abcdef1234567890
kubernetes.container_image: Image used to run the container. Example: registry.example.com/myapp:v1.2.3
kubernetes.container_name: Name of the container within the pod. Example: myapp-container
kubernetes.namespace_labels: Labels applied to the namespace. Example: { "team": "platform", "env": "prod" }
kubernetes.pod_annotations: Annotations applied to the pod. Example: { "sidecar.istio.io/inject": "true" }
kubernetes.pod_ip: Primary IP address assigned to the pod. Example: 10.1.2.3
kubernetes.pod_ips: All IP addresses assigned to the pod (IPv4 and IPv6). Example: [ "10.1.2.3", "fd00::1234" ]
kubernetes.pod_labels: Labels assigned to the pod. Example: { "app": "myapp", "tier": "frontend" }
kubernetes.pod_name: Name of the pod. Example: myapp-deployment-5d8f6b9f4b-jx9kl
kubernetes.pod_namespace: Namespace in which the pod runs. Example: production
kubernetes.pod_node_name: Name of the node hosting the pod. Example: gke-cluster-node-1
kubernetes.pod_owner: Owning resource that controls the pod. Example: ReplicaSet/myapp-deployment-5d8f6b9f4b
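The example referenced above counts recent log entries per pod in one namespace. It is a sketch that assumes dotted column names can be quoted with backticks, as in ClickHouse SQL; the 'production' namespace and the one-hour window are illustrative:

-- Count log entries per pod in one namespace over the last hour.
SELECT `kubernetes.pod_name` AS pod, count() AS entries
FROM hydro.logs
WHERE timestamp > NOW() - INTERVAL 1 HOUR
  AND `kubernetes.pod_namespace` = 'production'
GROUP BY pod
ORDER BY entries DESC
LIMIT 20;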

Catchall field

When ingesting data into hydro.logs, the catchall column receives data for any unknown input fields. See Catchall features for more information.

Query the catchall map-type column to find strings that include OS and hardware information about cluster nodes, for example catchall['kubernetes.node_labels'].
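For instance, this sketch lists the distinct node label strings seen in the last hour. It assumes the map values are returned as strings; the empty-string filter simply skips rows that don't carry the key:

-- Distinct node label strings captured in the catchall map.
SELECT DISTINCT catchall['kubernetes.node_labels'] AS node_labels
FROM hydro.logs
WHERE timestamp > NOW() - INTERVAL 1 HOUR
  AND catchall['kubernetes.node_labels'] != '';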

ORDER BY LIMIT optimization

The catchall field also contains log data from the query subsystem about usage of the ORDER BY LIMIT optimization.

limit_optimization.level: Indicates whether a LIMIT query optimization was applied, and which type.
limit_optimization.num_partitions_skipped: Number of partitions skipped by the optimizer.
limit_optimization.num_partitions_unskippable: Number of partitions that could not be skipped.

This example shows how to query these catchall keys to find optimization information:

-- Pull the ORDER BY LIMIT optimization counters recorded by the query subsystem.
SELECT
  catchall['limit_optimization.level'] AS level,
  catchall['limit_optimization.num_partitions_skipped'] AS skipped,
  catchall['limit_optimization.num_partitions_unskippable'] AS unskippable
FROM hydro.logs
-- Bound the time range and narrow to the query-head service to keep the scan small.
WHERE timestamp > NOW() - INTERVAL 1 HOUR
  AND app = 'query-head';

What’s Next

Learn about effective queries for cluster health