
Logging Configuration

Each Hydrolix cluster runs an instance of Vector which collects logs from Hydrolix applications in the same Kubernetes namespace and transmits the data to the configured destinations.

This page describes how to configure the logging output generated by applications in a Hydrolix cluster.

If you would like to send data to your Hydrolix cluster using Vector, see Vector Integration.

Log collection

Logs are always collected from the Hydrolix cluster's namespace.

To collect application logging from other namespaces in the Kubernetes cluster, list the other namespaces in the vector_extra_namespaces Hydrolix Tunable.

spec:
  vector_extra_namespaces:
  - cnpg-system
  - kube-system

Applications in other namespaces may not uniformly produce JSON log lines. The Vector configuration distinguishes between two types of log messages: those that appear to be JSON and those that don't.

Log messages that begin with { are interpreted as JSON. Fields that can't be mapped to common Hydrologs fields are stored in the catchall column.

Log messages that don't begin with { are interpreted as text. The entire log line is stored in the message column.

The originating Kubernetes namespace is recorded with each log line: the kubernetes.pod_namespace column in the hydro.logs table contains the namespace name.
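For illustration, consider two hypothetical log lines from a pod in the cnpg-system namespace. Everything except the message, catchall, and kubernetes.pod_namespace columns is an assumption here:

# A line beginning with { is parsed as JSON; recognized fields map to
# Hydrologs columns, and unrecognized fields such as "shard" end up in
# the catchall column.
{"level": "info", "msg": "checkpoint complete", "shard": 3}

# A line not beginning with { is stored verbatim in the message column.
2024-11-22T18:34:25Z INFO checkpoint complete

# In both cases, kubernetes.pod_namespace records "cnpg-system".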

In Vector terminology, all of these use the Kubernetes logs source.

Log destinations

By default, Hydrolix application logs are sent to Hydrologs, stored in each cluster's hydro.logs table, and also stored as compressed files in the object store filesystem.

A pod running an instance of Vector collects logs from all Hydrolix applications in Kubernetes and transmits the data to the configured destinations.

The destination, method of delivery, and choice of authentication are all configurable. The Hydrolix logging configuration supports compressed file delivery to object storage and streaming through an HTTP sink.

Destination              | Method of delivery                              | Authentication | Enabled | Notes
Object store filesystem  | Compressed files stored in primary object store | no             | yes     | Files named log/$POD/$CONTAINER
Local Hydrologs          | Logs streamed to local Hydrolix intake heads    | no             | yes     | http://hydrologs-intake-head:8089/ingest/event
Remote Hydrolix endpoint | Logs streamed to a remote HTTP endpoint         | recommended    | no      | Example: another Hydrolix cluster

Any combination of the three types of destinations is a valid configuration.

Log configuration

Configure the Hydrolix log destinations in your hydrolixcluster.yaml:

spec:
  logs_sink_type: "http"
  logs_sink_local_url: "http://hydrologs-intake-head:8089/ingest/event"
  logs_http_table: "hydro.logs"
  logs_http_transform: "megaTransform"
  logs_sink_remote_url: ""
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"
  logs_sink_remote_auth_type: basic
  logs_sink_remote_auth_enabled: false

Each setting is described below:

spec:
  logs_sink_type: string                  # The type of log data sink. Only valid option is "http", although the object storage sink uses type "aws_s3".
  logs_sink_local_url: string             # The full URI to send local HTTP requests containing log data.
  logs_http_table: string                 # An existing Hydrolix <project.table> where the log data should be stored in the local cluster object store.
  logs_http_transform: string             # The transform schema to use for log data ingested into the local cluster.
  logs_sink_remote_url: string            # The full URI to send remote HTTP requests containing log data.
  logs_http_remote_table: string          # An existing Hydrolix <project.table> where the log data should be stored within the remote cluster object store.
  logs_http_remote_transform: string      # The transform schema to use for log data ingested into the remote cluster.
  logs_sink_remote_auth_type: string      # For 'basic', read LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD; for 'token', read LOGS_HTTP_AUTH_TOKEN
  logs_sink_remote_auth_enabled: boolean  # If true, use the auth type specified in logs_sink_remote_auth_type and the corresponding credentials from the curated Secret

Object store filesystem

To suppress storage of the gzip-compressed log data in the object store filesystem, set the Hydrolix tunable disable_vector_bucket_logging to true.
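For example, in hydrolixcluster.yaml:

spec:
  disable_vector_bucket_logging: true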

Local Hydrolix endpoint

The local in-cluster Hydrolix endpoint, http://hydrologs-intake-head:8089/ingest/event, can't be configured.

Remote Hydrolix endpoint

Specify a remote Hydrolix stream ingestion endpoint.

Destination                                                            | Description
https://{myhost}.hydrolix.live/ingest/event                            | Send to the default ingest pools of a remote cluster
https://{myhost}.hydrolix.live/pool/{custom_ingest_pool}/ingest/event  | Send to a custom ingest pool of a remote cluster. See Resource Pools.
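For example, to stream logs to a custom ingest pool on a remote cluster, point logs_sink_remote_url at the pool endpoint. The hostname and {custom_ingest_pool} below are placeholders:

spec:
  logs_sink_remote_url: "https://company.hydrolix.live/pool/{custom_ingest_pool}/ingest/event"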

Example configurations

The fragment below demonstrates the use of HTTP Basic Access Authentication to an HTTP Stream API.

Since v5.4, Hydrolix also supports service accounts. Follow the Service Accounts How-to to create a long-lived auth token.

Remote HTTP with basic auth

spec:
  logs_sink_type: "http"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_sink_remote_auth_type: basic
  logs_sink_remote_auth_enabled: true
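A token-based variant of the same configuration is sketched below. It assumes a long-lived auth token, for example from a service account, has been installed in the curated secret as LOGS_HTTP_AUTH_TOKEN (see Credential management below); the table and transform names are illustrative.

spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_sink_remote_auth_type: token
  logs_sink_remote_auth_enabled: true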

Credential management

When sending your logs to an external sink, you will usually need to provide authentication credentials.

Hydrolix uses a secret called curated to hold administratively managed values that are merged into the cluster configuration.

Set these values using a Kubernetes Secret. Use kubectl to interact with the Kubernetes Secret store.

Create or modify the curated secret using one of the following commands and add the required variables, depending on your selection of basic or token auth type.

Use basic authentication credentials

This configures Vector to present HTTP Basic Access Authentication in the Authorization header when connecting to the remote log endpoint.

  1. Set logs_sink_remote_auth_enabled to true.
  2. (Optional) Set logs_sink_remote_auth_type explicitly to basic, which is the default.
  3. Install the basic authentication credentials in the curated secret in variables named LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD (see Create a curated secret below).

Use OAuth2 bearer token

This configures Vector to present an OAuth 2.0 bearer token in the Authorization header when connecting to the remote log endpoint.

  1. Set logs_sink_remote_auth_enabled to true.
  2. Set logs_sink_remote_auth_type to token.
  3. Install the token into the curated secret in a variable named LOGS_HTTP_AUTH_TOKEN, as in the sketch below.
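A minimal sketch of installing the token, assuming the curated secret doesn't exist yet (see Create a curated secret below; {token} is a placeholder):

kubectl -n ${HDX_KUBERNETES_NAMESPACE} create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_TOKEN='{token}'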

Create a curated secret

When creating an Opaque Secret, use the generic subcommand.

Name the new secret curated, which is where the Hydrolix operator expects to find it.

kubectl -n ${HDX_KUBERNETES_NAMESPACE} create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_USERNAME='{username}@{domain}.{tld}' \
    --from-literal=LOGS_HTTP_AUTH_PASSWORD='{password}'

During secret creation, the tooling transparently base64 encodes the values.
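You can verify this with the base64 tool; for example, the placeholder username user shown under Edit a curated secret encodes as:

echo -n 'user' | base64
# dXNlcg==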

Display a curated secret

Confirm the secret was successfully created in the namespace containing your cluster by running the following commands:

# -- show only presence of secret, or error if missing
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated 
# -- dump the sensitive contents of the secrets
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated --output yaml
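To check a single value without scanning the full YAML dump, you can decode one key directly; this is a standard kubectl jsonpath pattern, not Hydrolix-specific:

# -- decode a single value from the secret
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated \
    --output jsonpath='{.data.LOGS_HTTP_AUTH_USERNAME}' | base64 --decode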

Edit a curated secret

Edit the secret in place with:

kubectl -n ${HDX_KUBERNETES_NAMESPACE} edit secrets curated

This should open a file with contents similar to:

apiVersion: v1
items:
- apiVersion: v1
  data:
    LOGS_HTTP_AUTH_PASSWORD: c2VrcmV0
    LOGS_HTTP_AUTH_USERNAME: dXNlcg==
    LOGS_HTTP_AUTH_TOKEN: ZXlKaGJHY2lPaUpGWkVSVFFTSXNJblI1Y0NJNklrcFhWQ0o5LmV5SnBjM01pT2lKb2RIUndjem92TDJSdlkzTXRjMkZ1WkdKdmVDNW9lV1J5YjJ4cGVDNWtaWFl2WTI5dVptbG5JaXdpWVhWa0lqb2lZMjl1Wm1sbkxXRndhU0lzSW5OMVlpSTZJbVV5WkdNeFpERXhMVFprT1RBdE5HRmtOQzA1WldNMkxXRTJNbU00WVRZNE9XUTJOaUlzSW1saGRDSTZNVGMxTnpBd05Ea3hOeTQzT0RnM056VXNJbVY0Y0NJNk1UYzRPRFUwTURreE55NDNNVFE0TlRRc0ltcDBhU0k2SWpVaWZRLmFWN0x4RXhYUFNjV2p4NnAwM0tuMTZ2T1U5RDJtIC0gcGhGT2VJeHFZdTBrSXFxS1dOMVdMQ2poV1hiYmhrYVRpMTBEclhrQTM0bFFheFpsakR1blRiQ3c=
  kind: Secret
  metadata:
    creationTimestamp: "2024-11-22T18:34:25Z"
    name: curated
    namespace: {k8s_namespace}
    resourceVersion: "30930857"
    uid: dd0672bc-99b7-485b-a45d-0f78f3c0f6f1
  type: Opaque
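If you prefer not to hand-edit base64 values, a merge patch with stringData also works; Kubernetes encodes the value for you ({token} is a placeholder):

kubectl -n ${HDX_KUBERNETES_NAMESPACE} patch secret curated \
    --type merge \
    --patch '{"stringData": {"LOGS_HTTP_AUTH_TOKEN": "{token}"}}'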

The operator pod collects secrets from curated and merges them into the dynamically-generated general secret.

Then it constructs a configuration file for the Vector application and pod, defining an HTTP sink that includes the remote endpoint and credentials.
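Conceptually, the generated sink is ordinary Vector configuration, roughly like the sketch below. This is illustrative only, not the operator's actual output; the sink name, input name, and endpoint are assumptions.

# Illustrative Vector HTTP sink; names and endpoint are assumptions.
sinks:
  remote_logs:
    type: http
    inputs: ["kubernetes_logs"]
    uri: "https://company.hydrolix.live/ingest/event"
    auth:
      strategy: basic
      user: "${LOGS_HTTP_AUTH_USERNAME}"
      password: "${LOGS_HTTP_AUTH_PASSWORD}"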

Log level

The log_level setting designates custom log levels for the services in a Hydrolix cluster.

Specify a YAML dictionary with Hydrolix service names or the wildcard string “*” as keys. Service name keys take precedence over the default wildcard key.

The example below sets all services that respect this setting to the info level, except for stream-head and query-head, which are set to critical and trace, respectively:

log_level:
  "*": info
  stream-head: critical
  query-head: trace

Valid log level values:

  • critical
  • error
  • warning
  • info
  • debug
  • trace

Values are case-insensitive.

Below is a list of services that support the log_level setting. Services not on this list ignore this setting.

  • akamai-siem-indexer
  • akamai-siem-peer
  • alter
  • alter-head
  • alter-indexer
  • autoingest
  • batch-head
  • batch-indexer
  • batch-peer
  • decay
  • hdx-scaler
  • intake-api
  • intake-head
  • intake-indexer
  • job-purge
  • kafka-indexer
  • kafka-peer
  • kinesis-indexer
  • kinesis-peer
  • log-vacuum
  • merge
  • merge-cleanup
  • merge-controller
  • merge-head
  • merge-indexer
  • merge-peer
  • partition-vacuum
  • prune-locks
  • query-head
  • query-peer
  • reaper
  • rejects-vacuum
  • stale-job-monitor
  • stream-head
  • stream-indexer
  • stream-peer
  • summary_peer
  • summary-indexer
  • task-monitor
  • turbine-api
  • validator
  • validator-indexer