Logging Configuration

Configure Hydrolix application log levels and destinations.

Each Hydrolix cluster runs an instance of Vector, which collects logs from Hydrolix applications in the same Kubernetes namespace and transmits them to the configured destinations.

This page describes how to configure the logging output generated by applications in a Hydrolix cluster.

To send data to a Hydrolix cluster using Vector, see Vector Integration.

Log collection

Logs are always collected from the Hydrolix cluster's namespace.

To collect application logging from other namespaces in the Kubernetes cluster, list the other namespaces in the vector_extra_namespaces Hydrolix Tunable.

spec:
  vector_extra_namespaces:
  - cnpg-system
  - kube-system

Applications in other namespaces may not uniformly produce JSON log lines. The Vector configuration distinguishes two types of log messages: those that appear to be JSON and those that don't.

Log messages that begin with { are interpreted as JSON. Fields that can't be mapped to common Hydrologs fields are stored in the catchall column.

Log messages that don't begin with { are interpreted as text. The entire log line is stored in the message column.
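The routing rule above can be sketched as a small shell function. This is an illustration only, not the actual Vector configuration: the function name classify_log_line is hypothetical, and the sketch assumes the check looks at the very first character of the line without trimming whitespace.

```shell
# Hypothetical helper: mimic the JSON-versus-text decision described above.
# Assumption: only the first character of the line is inspected.
classify_log_line() {
  case "$1" in
    "{"*) echo "json" ;;   # begins with {: interpreted as JSON
    *)    echo "text" ;;   # anything else: whole line goes to the message column
  esac
}

classify_log_line '{"level":"info","msg":"started"}'   # prints json
classify_log_line 'plain startup banner'               # prints text
```

A line that merely contains a brace later on, such as `retry {3}`, is still classified as text under this rule.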

Vector records the Kubernetes namespace each log line originated from. The kubernetes.pod_namespace column in the hydro.logs table contains the namespace name.

In Vector terminology, all of these logs come from a kubernetes_logs source.

Log destinations

By default, Hydrolix application logs are sent to Hydrologs, stored in each cluster's hydro.logs table, and also stored as compressed files in the object store filesystem.

A pod running an instance of Vector collects logs from all Hydrolix applications in Kubernetes and transmits the data to the configured destinations.

The destination, method of delivery, and choice of authentication are all configurable. The Hydrolix logging configuration supports compressed file delivery to object storage and streaming via an HTTP sink.

Destination              | Method of delivery                               | Authentication | Enabled by default | Notes
-------------------------|--------------------------------------------------|----------------|--------------------|------
Object store filesystem  | Compressed files stored in primary object store  | no             | yes                | Files named log/$POD/$CONTAINER
Local Hydrologs          | Logs streamed to local Hydrolix intake heads     | no             | yes                | http://hydrologs-intake-head:8089/ingest/event
Remote Hydrolix endpoint | Logs streamed to a remote HTTP endpoint          | recommended    | no                 | For example, another Hydrolix cluster

Any combination of the three types of destinations is a valid configuration.

Log configuration

Configure the Hydrolix log destinations in your hydrolixcluster.yaml:

spec:
  logs_sink_type: "http"
  logs_sink_local_url: "http://hydrologs-intake-head:8089/ingest/event"
  logs_http_table: "hydro.logs"
  logs_http_transform: "megaTransform"
  logs_sink_remote_url: ""
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"
  logs_sink_remote_auth_enabled: false

Each setting has the following type and meaning:

spec:
  logs_sink_type: string                  # The type of log data sink. Only valid option is "http", although the object storage sink uses type "aws_s3".
  logs_sink_local_url: string             # The full URI to send local HTTP requests containing log data.
  logs_http_table: string                 # An existing Hydrolix <project.table> where the log data should be stored in the local cluster object store.
  logs_http_transform: string             # The transform schema to use for log data ingested into the local cluster.
  logs_sink_remote_url: string            # The full URI to send remote HTTP requests containing log data.
  logs_http_remote_table: string          # An existing Hydrolix <project.table> where the log data should be stored within the remote cluster object store.
  logs_http_remote_transform: string      # The transform schema to use for log data ingested into the remote cluster.
  logs_sink_remote_auth_enabled: boolean  # If true, use basic authentication and credentials from curated Secret values LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD.

Object store filesystem

To suppress storage of the gzip-compressed log data in the object store filesystem, set the Hydrolix tunable disable_vector_bucket_logging to true.

Local Hydrolix endpoint

The local Hydrolix endpoint can't be configured.

Remote Hydrolix endpoint

Specify a remote Hydrolix stream ingestion endpoint.

Destination                                                            | Description
-----------------------------------------------------------------------|------------
https://{myhost}.hydrolix.live/ingest/event                            | Send to the default ingest pools of a remote cluster
https://{myhost}.hydrolix.live/pool/{custom_ingest_pool}/ingest/event  | Send to a custom ingest pool of a remote cluster. See Resource Pools
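The two URL shapes above differ only in the optional pool path segment. The following sketch builds either form from a host name and an optional pool name. The function name remote_ingest_url and the sample values myhost and logs_pool are hypothetical placeholders, not names from a real cluster.

```shell
# Hypothetical helper: build a value for logs_sink_remote_url.
# $1 = host prefix of the remote cluster, $2 = optional custom ingest pool name.
remote_ingest_url() {
  host="$1"
  pool="${2:-}"
  if [ -n "$pool" ]; then
    # Custom ingest pool of the remote cluster
    printf 'https://%s.hydrolix.live/pool/%s/ingest/event\n' "$host" "$pool"
  else
    # Default ingest pools of the remote cluster
    printf 'https://%s.hydrolix.live/ingest/event\n' "$host"
  fi
}

remote_ingest_url myhost             # prints https://myhost.hydrolix.live/ingest/event
remote_ingest_url myhost logs_pool   # prints https://myhost.hydrolix.live/pool/logs_pool/ingest/event
```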

Example configurations

Remote HTTP with authentication

spec:
  logs_sink_type: "http"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_sink_remote_auth_enabled: true

Credential management

When sending your logs to an external sink, you will usually need to provide authentication credentials.

Set logs_sink_remote_auth_enabled to true and provide basic authentication credentials in the following environment variables:

  • LOGS_HTTP_AUTH_USERNAME
  • LOGS_HTTP_AUTH_PASSWORD

Because these values are sensitive, set them using a Kubernetes Secret. Use kubectl to interact with the Kubernetes Secret store.

Hydrolix uses a secret called curated to hold administratively managed values that are merged into the cluster configuration.

Create or modify the curated secret using the following commands.

Create a curated secret

When creating an Opaque Secret, use the generic subcommand.

Name the new secret curated, which is the name the Hydrolix operator expects.

kubectl -n ${HDX_KUBERNETES_NAMESPACE} create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_USERNAME='{username}@{domain}.{tld}' \
    --from-literal=LOGS_HTTP_AUTH_PASSWORD='{password}'

During secret creation, the tooling transparently base64 encodes the values.

Display a curated secret

Confirm the secret was successfully created in the namespace containing your cluster by running one of the following commands:

# -- show only presence of secret, or error if missing
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated 
# -- dump the sensitive contents of the secrets
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated --output yaml
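The YAML output shows each value base64-encoded. To read a single value in plain text, one option is to decode it yourself. The sketch below uses a hypothetical helper named decode_secret_value and assumes a base64 utility that accepts -d (GNU coreutils and BusyBox do). The sample input dXNlcg== is the base64 encoding of the string user.

```shell
# Hypothetical helper: decode one base64 value copied from the secret's data map.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

decode_secret_value 'dXNlcg=='   # prints user
echo

# Against a live cluster, the same decoding can be applied to kubectl's
# jsonpath output (left commented out because it needs cluster access):
# kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated \
#     --output jsonpath='{.data.LOGS_HTTP_AUTH_USERNAME}' | base64 -d
```

Decoded values are sensitive, so avoid running this where terminal output is recorded.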

Edit a curated secret

To edit the secret in place, run:

kubectl -n ${HDX_KUBERNETES_NAMESPACE} edit secrets curated

This opens a file in your editor with contents similar to:

apiVersion: v1
items:
- apiVersion: v1
  data:
    LOGS_HTTP_AUTH_PASSWORD: c2VrcmV0
    LOGS_HTTP_AUTH_USERNAME: dXNlcg==
  kind: Secret
  metadata:
    creationTimestamp: "2024-11-22T18:34:25Z"
    name: curated
    namespace: {k8s_namespace}
    resourceVersion: "30930857"
    uid: dd0672bc-99b7-485b-a45d-0f78f3c0f6f1
  type: Opaque
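Unlike kubectl create secret, which encodes values for you, hand-editing requires that every value under data already be base64-encoded. The following sketch shows one way to produce such a value. The helper name encode_secret_value is hypothetical, and the sample inputs are chosen to match the encoded values in the example above.

```shell
# Hypothetical helper: base64-encode a value for pasting under data: in the secret.
encode_secret_value() {
  # printf '%s' avoids the trailing newline that echo would append to the value;
  # tr removes any line wrapping that base64 applies to long output.
  printf '%s' "$1" | base64 | tr -d '\n'
  printf '\n'
}

encode_secret_value 'user'     # prints dXNlcg==
encode_secret_value 'sekret'   # prints c2VrcmV0
```

Encoding with echo instead of printf would embed a newline in the credential, which typically shows up later as an authentication failure.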

The local cluster in which you created your curated secret should now be able to access the remote cluster.

The operator pod collects secrets from curated and merges them into the dynamically generated general secret.

The operator constructs a configuration file for Vector that defines an HTTP sink with the remote endpoint and credentials for the Vector pod.

Log level

The log_level setting designates custom log levels for the services in a Hydrolix cluster.

Specify a YAML dictionary with Hydrolix service names or the wildcard string "*" as keys. Service name keys take precedence over the default wildcard key.

The example below sets all services that respect this setting to the info level, except for stream-head and query-head, which use the critical and trace levels, respectively:

log_level:
  "*": info
  stream-head: critical
  query-head: trace
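The precedence rule can be sketched as a lookup that tries the service name first and falls back to the wildcard. This is an illustration of the rule only, not how Hydrolix implements it. The LOG_LEVELS variable, its key=value line format, and the resolve_log_level function are all hypothetical.

```shell
# Hypothetical flat representation of the log_level dictionary, one key=value per line.
LOG_LEVELS='*=info
stream-head=critical
query-head=trace'

# Hypothetical helper: print the level for a service, preferring an exact
# service-name key and falling back to the "*" wildcard key.
resolve_log_level() {
  service="$1"
  level=$(printf '%s\n' "$LOG_LEVELS" | awk -F= -v s="$service" '$1 == s { print $2; exit }')
  if [ -z "$level" ]; then
    level=$(printf '%s\n' "$LOG_LEVELS" | awk -F= '$1 == "*" { print $2; exit }')
  fi
  printf '%s\n' "$level"
}

resolve_log_level stream-head   # prints critical
resolve_log_level merge-peer    # prints info
```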

Valid log level values:

  • critical
  • error
  • warning
  • info
  • debug
  • trace

Values are case-insensitive.
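Before applying a configuration, it can help to check a candidate value against the list above. The following sketch normalizes case and then matches against the six documented levels. The function name is_valid_log_level is hypothetical and the check is not part of any Hydrolix tooling.

```shell
# Hypothetical helper: succeed if the argument is one of the documented levels,
# ignoring case.
is_valid_log_level() {
  level=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$level" in
    critical|error|warning|info|debug|trace) return 0 ;;
    *) return 1 ;;
  esac
}

is_valid_log_level "INFO" && echo "INFO is valid"
is_valid_log_level "verbose" || echo "verbose is not valid"
```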

Below is a list of services that support the log_level setting. Services not on this list ignore this setting.

  • akamai-siem-indexer
  • akamai-siem-peer
  • alter
  • alter-head
  • alter-indexer
  • autoingest
  • batch-head
  • batch-indexer
  • batch-peer
  • decay
  • hdx-scaler
  • intake-api
  • intake-head
  • intake-indexer
  • job-purge
  • kafka-indexer
  • kafka-peer
  • kinesis-indexer
  • kinesis-peer
  • log-vacuum
  • merge
  • merge-cleanup
  • merge-controller
  • merge-head
  • merge-indexer
  • merge-peer
  • partition-vacuum
  • prune-locks
  • query-head
  • query-peer
  • reaper
  • rejects-vacuum
  • stale-job-monitor
  • stream-head
  • stream-indexer
  • stream-peer
  • summary_peer
  • summary-indexer
  • task-monitor
  • turbine-api
  • validator
  • validator-indexer