Logging Configuration

Configure the log level and destination(s) for your Hydrolix logs.

Log Level

The log_level setting designates custom log levels for the services in a Hydrolix cluster.

Specify a YAML dictionary with Hydrolix service names or the wildcard string “*” as keys. Service name keys take precedence over the default wildcard key.

The example below sets all services that respect this setting to the info level, except for stream-head and query-head, which use the critical and trace levels, respectively:

log_level:
  "*": info
  stream-head: critical
  query-head: trace

Valid log level values:

  • critical
  • error
  • warning
  • info
  • debug
  • trace

Values are case-insensitive.

Below is a list of services that support the log_level setting; services not on this list ignore it. An example override for specific services follows the list.

  • akamai-siem-indexer
  • akamai-siem-peer
  • alter
  • alter-head
  • alter-indexer
  • autoingest
  • batch-head
  • batch-indexer
  • batch-peer
  • decay
  • intake-api
  • intake-head
  • job-purge
  • kafka-indexer
  • kafka-peer
  • kinesis-indexer
  • kinesis-peer
  • log-vacuum
  • merge
  • merge-cleanup
  • merge-controller
  • merge-head
  • merge-indexer
  • merge-peer
  • partition-vacuum
  • prune-locks
  • query-head
  • query-peer
  • reaper
  • rejects-vacuum
  • stale-job-monitor
  • stream-head
  • stream-indexer
  • stream-peer
  • summary-indexer
  • summary-peer
  • task-monitor
  • turbine-api
  • validator
  • validator-indexer
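
For example, you might raise only the Kafka ingest services to debug while keeping everything else at warning when troubleshooting a Kafka ingest path. A minimal sketch using service names from the list above; choose the levels that suit your situation:

log_level:
  "*": warning
  kafka-peer: debug
  kafka-indexer: debug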

Log Destinations

By default, logs generated by Hydrolix are stored in the hydro.logs table in the cluster's default object store. Delivery is handled by a locally deployed Vector instance, which places log data on an internal Redpanda queue. Configuring log routing generally comes down to whether logs should be stored only in the local cluster's object store, as is done by default, or in both the local and a remote Hydrolix cluster's object store. The latter "dual sink" mode uses HTTPS for transport and is more configurable, allowing two outputs with different table and transform destinations.

The destination, method of delivery, and whether to use remote cluster authentication for Hydrolix logs are all configurable. There are two "sink" types for the Vector instance, which determine how log data is delivered (kafka or http), and four possible configurations:

| Destination(s) | Method of Delivery | Authentication |
| --- | --- | --- |
| Local cluster object store | Routed through a local Redpanda queue via a kafka sink | no |
| Local cluster object store | Routed through the local Intake-heads via an http sink | no |
| Local cluster object store and remote cluster object store | Routed through local and external Intake-heads via http sinks. This option allows you to send a copy of your Hydrolix cluster's log data to another Hydrolix cluster. | no |
| Local cluster object store and remote cluster object store | Routed through local and external Intake-heads via http sinks. This option allows you to send a copy of your Hydrolix cluster's log data to another Hydrolix cluster. | yes |

This means there are two delivery options for the internal sink:

  1. A Redpanda stream (default)
  2. The cluster's local HTTP ingest endpoint: http://stream-head:8089/ingest/event

For the external sink, the only method of delivery is http. However, there are two distinct ways to specify your destination (an example follows the table):

| Destination | Description |
| --- | --- |
| https://{remote_hdx_host}/ingest/event | Send to the default ingest pools of a remote cluster |
| https://{remote_hdx_host}/pool/{custom-ingest-pool}/ingest/event | Send to a custom ingest pool of a remote cluster. You can read more about creating custom ingest pools in our Resource Pools documentation. |
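
For example, a minimal sketch of the remote sink settings pointed at a custom ingest pool. The host and the pool name log-ingest-pool are placeholders; the remaining settings are described under Configuration below:

spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/pool/log-ingest-pool/ingest/event"
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"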

🚧 Incorporating an external sink requires both sinks to be http

If you specify logging to an external sink via http, the internal sink must also be configured to be http. You cannot send your Hydrolix log data to both the internal Redpanda queue and a remote HTTP sink.

Configuration

You can modify your Hydrolix log destination(s) and related configuration in your hydrolixcluster.yaml. The first block below shows example values; the second lists each setting's type and purpose:

Example:

spec:
  logs_sink_type: "kafka" # this places the log data on the local cluster's Redpanda queue, not a Kafka queue
  logs_sink_local_url: "http://stream-head:8089/ingest/event"
  logs_sink_remote_url: ""
  logs_sink_remote_auth_enabled: false
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"
  logs_http_table: "hydro.logs"
  logs_http_transform: "megaTransform"

Field reference:

spec:
  logs_sink_type: string # The type of log data sink. Valid options are kafka or http.
  logs_sink_local_url: string # The full URI to send local HTTP requests containing log data.
  logs_sink_remote_url: string # The full URI to send remote HTTP requests containing log data.
  logs_sink_remote_auth_enabled: boolean # When enabled, remote HTTP will use basic authentication from the curated secret. Enabling this option requires providing credentials via the environment variables LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD.
  logs_http_remote_table: string # An existing Hydrolix <project.table> where the log data should be stored within the remote cluster's object store.
  logs_http_remote_transform: string # The transform schema to use for log data ingested into the remote cluster.
  logs_http_table: string # An existing Hydrolix <project.table> where the log data should be stored within the local cluster's object store.
  logs_http_transform: string # The transform schema to use for log data ingested into the local cluster.
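
After editing these settings, apply them to your cluster. A minimal sketch, assuming you manage the spec as a hydrolixcluster.yaml file and that HDX_KUBERNETES_NAMESPACE holds your cluster's namespace (as in the secret commands later in this section):

kubectl apply -f hydrolixcluster.yaml --namespace ${HDX_KUBERNETES_NAMESPACE}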

Example Configurations

The following are example configurations for each of the four configuration options.

Local Redpanda (default)

Because this is the default state of every Hydrolix deployment, there is no additional configuration needed for Hydrolix logs to be routed via a Redpanda queue to the cluster's default object store.

Local HTTP

spec:
  logs_sink_type: "http"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"

Local HTTP and Remote Cluster over HTTP without Authentication

spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"

Local HTTP and Remote Cluster over HTTP with Authentication

spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_sink_remote_auth_enabled: true
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"

Basic Authentication

If you are sending your logs to an external sink and the remote cluster requires authentication, you will need to provide credentials. Do so by enabling logs_sink_remote_auth_enabled and providing basic authentication credentials (username and password). You can set the username and password using the following environment variables:

  • LOGS_HTTP_AUTH_USERNAME
  • LOGS_HTTP_AUTH_PASSWORD

Store these values in a Kubernetes Secret and manage it with kubectl. Create a secret named curated using the following command, filling in the appropriate username and password for accessing the remote cluster:

kubectl create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_USERNAME='{username}@{domain}.{tld}' \
    --from-literal=LOGS_HTTP_AUTH_PASSWORD='{password}'

You can then confirm the secret was successfully created for the namespace containing your cluster by running the following command:

kubectl edit secrets --namespace ${HDX_KUBERNETES_NAMESPACE}

which should produce a result similar to:

apiVersion: v1
items:
- apiVersion: v1
  data:
    LOGS_HTTP_AUTH_PASSWORD: c2VrcmV0
    LOGS_HTTP_AUTH_USERNAME: dXNlcg==
  kind: Secret
  metadata:
    creationTimestamp: "2024-11-22T18:34:25Z"
    name: curated
    namespace: {k8s_namespace}
    resourceVersion: "30930857"
    uid: dd0672bc-99b7-485b-a45d-0f78f3c0f6f1
  type: Opaque

The local cluster in which you created your curated secret should now be able to access the remote cluster. The operator pod will aggregate all the values from the curated secret and merge them into the dynamically generated general secret. This general secret is loaded by the Vector pod, giving it access to the remote cluster's credentials.
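
To confirm the merge happened, you can inspect the generated secret. A minimal sketch, assuming the merged secret is named general and lives in the same namespace:

kubectl get secret general --namespace ${HDX_KUBERNETES_NAMESPACE} \
    -o jsonpath='{.data.LOGS_HTTP_AUTH_USERNAME}' | base64 --decode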