
Logging Configuration

Each Hydrolix cluster runs an instance of Vector which collects logs from Hydrolix applications in the same Kubernetes namespace and transmits the data to the configured destinations.

This page describes how to configure the logging output generated by applications in a Hydrolix cluster.

If you would like to send data to your Hydrolix cluster using Vector, see Vector Integration.

Log collection

Logs are always collected from the Hydrolix cluster's namespace.

To collect application logging from other namespaces in the Kubernetes cluster, list the other namespaces in the vector_extra_namespaces Hydrolix Tunable.

spec:
  vector_extra_namespaces:
  - cnpg-system
  - kube-system

Applications in other namespaces may not uniformly produce JSON log lines. The Vector configuration distinguishes between two types of log messages: those that appear to be JSON and those that don't.

Log messages that begin with { are interpreted as JSON. Fields that can't be mapped to common Hydrologs fields are stored in the catchall column.

Log messages that don't begin with { are interpreted as text. The entire log line is stored in the message column.

The originating Kubernetes namespace is recorded with each log line: the kubernetes.pod_namespace column in the hydro.logs table contains the namespace name.
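For illustration, consider two hypothetical log lines from a pod in the cnpg-system namespace. Everything except the message, catchall, and kubernetes.pod_namespace columns is an assumption here:

# A line beginning with { is parsed as JSON; recognized fields map to
# Hydrologs columns, and unrecognized fields such as "shard" end up in
# the catchall column.
{"level": "info", "msg": "checkpoint complete", "shard": 3}

# A line not beginning with { is stored verbatim in the message column.
2024-11-22T18:34:25Z INFO checkpoint complete

# In both cases, kubernetes.pod_namespace records "cnpg-system".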

In Vector terminology, all of these use the Kubernetes logs source.

Log destinations

By default, Hydrolix application logs are sent to Hydrologs, stored in each cluster's hydro.logs table, and also stored as compressed files in the object store filesystem.

A pod running an instance of Vector collects logs from all Hydrolix applications in Kubernetes and transmits the data to the configured destinations.

The destination, method of delivery, and choice of authentication are all configurable. The Hydrolix logging configuration supports compressed file delivery to object storage and streaming through an HTTP sink.

Destination              | Method of delivery                              | Authentication | Enabled | Notes
Object store filesystem  | Compressed files stored in primary object store | no             | yes     | Files named log/$POD/$CONTAINER
Local Hydrologs          | Logs streamed to local Hydrolix intake heads    | no             | yes     | http://hydrologs-intake-head:8089/ingest/event
Remote Hydrolix endpoint | Logs streamed to a remote HTTP endpoint         | recommended    | no      | Example: another Hydrolix cluster

Any combination of the three types of destinations is a valid configuration.

Log configuration

Configure the Hydrolix log destinations in your hydrolixcluster.yaml:

spec:
  logs_sink_type: "http"
  logs_sink_local_url: "http://hydrologs-intake-head:8089/ingest/event"
  logs_http_table: "hydro.logs"
  logs_http_transform: "megaTransform"
  logs_sink_remote_url: ""
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"
  logs_sink_remote_auth_type: basic
  logs_sink_remote_auth_enabled: false

Each setting is described below:

spec:
  logs_sink_type: string                  # The type of log data sink. Only valid option is "http", although the object storage sink uses type "aws_s3".
  logs_sink_local_url: string             # The full URI to send local HTTP requests containing log data.
  logs_http_table: string                 # An existing Hydrolix <project.table> where the log data should be stored in the local cluster object store.
  logs_http_transform: string             # The transform schema to use for log data ingested into the local cluster.
  logs_sink_remote_url: string            # The full URI to send remote HTTP requests containing log data.
  logs_http_remote_table: string          # An existing Hydrolix <project.table> where the log data should be stored within the remote cluster object store.
  logs_http_remote_transform: string      # The transform schema to use for log data ingested into the remote cluster.
  logs_sink_remote_auth_type: string      # For 'basic', read LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD; for 'token', read LOGS_HTTP_AUTH_TOKEN
  logs_sink_remote_auth_enabled: boolean  # If true, use the auth type specified in logs_sink_remote_auth_type and the corresponding credentials from the curated Secret

Object store filesystem

To suppress storage of the gzip-compressed log data in the object store filesystem, set the Hydrolix tunable disable_vector_bucket_logging to true.
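For example, in hydrolixcluster.yaml:

spec:
  disable_vector_bucket_logging: true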

Local Hydrolix endpoint

The local in-cluster Hydrolix endpoint, http://hydrologs-intake-head:8089/ingest/event, can't be configured.

Remote Hydrolix endpoint

Specify a remote Hydrolix stream ingestion endpoint.

Destination                                                            | Description
https://{myhost}.hydrolix.live/ingest/event                            | Send to the default ingest pools of a remote cluster
https://{myhost}.hydrolix.live/pool/{custom_ingest_pool}/ingest/event  | Send to a custom ingest pool of a remote cluster. See Resource Pools.
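For example, to stream logs to a custom ingest pool on a remote cluster, point logs_sink_remote_url at the pool endpoint. The hostname and {custom_ingest_pool} below are placeholders:

spec:
  logs_sink_remote_url: "https://company.hydrolix.live/pool/{custom_ingest_pool}/ingest/event"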

Example configurations

The fragment below demonstrates the use of HTTP Basic Access Authentication to an HTTP Stream API.

Since v5.4, Hydrolix also supports service accounts. Follow the Service Accounts How-to to create a long-lived auth token.

Remote HTTP with basic auth

spec:
  logs_sink_type: "http"
  logs_http_table: "team_project.cluster_logs"
  logs_http_transform: "custom_transform"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_sink_remote_auth_type: basic
  logs_sink_remote_auth_enabled: true
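A token-based variant of the same configuration is sketched below. It assumes a long-lived auth token, for example from a service account, has been installed in the curated secret as LOGS_HTTP_AUTH_TOKEN (see Credential management below); the table and transform names are illustrative.

spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
  logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
  logs_http_remote_transform: "multi_cluster_log_transform"
  logs_sink_remote_auth_type: token
  logs_sink_remote_auth_enabled: true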

Credential management

When sending your logs to an external sink, you will usually need to provide authentication credentials.

Hydrolix uses a secret called curated to hold administratively managed values that are merged into the cluster configuration.

Set these values using a Kubernetes Secret. Use kubectl to interact with the Kubernetes Secret store.

Create or modify the curated secret using one of the following commands and add the required variables, depending on your selection of basic or token auth type.

Use basic authentication credentials

This configures Vector to present HTTP Basic Access Authentication in the Authorization header when connecting to the remote log endpoint.

  1. Set logs_sink_remote_auth_enabled to true.
  2. (Optional) Set logs_sink_remote_auth_type explicitly to basic, which is the default.
  3. Install the basic authentication credentials in the curated secret in variables named LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD (see Create a curated secret below).

Use OAuth2 bearer token

This configures Vector to present an OAuth 2.0 bearer token in the Authorization header when connecting to the remote log endpoint.

  1. Set logs_sink_remote_auth_enabled to true.
  2. Set logs_sink_remote_auth_type to token.
  3. Install the token into the curated secret in a variable named LOGS_HTTP_AUTH_TOKEN, as in the sketch below.
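A minimal sketch of installing the token, assuming the curated secret doesn't exist yet (see Create a curated secret below; {token} is a placeholder):

kubectl -n ${HDX_KUBERNETES_NAMESPACE} create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_TOKEN='{token}'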

Create a curated secret

When creating an Opaque Secret, use the generic subcommand.

Name the new secret curated, which is where the Hydrolix operator expects to find it.

kubectl -n ${HDX_KUBERNETES_NAMESPACE} create secret generic curated \
    --from-literal=LOGS_HTTP_AUTH_USERNAME='{username}@{domain}.{tld}' \
    --from-literal=LOGS_HTTP_AUTH_PASSWORD='{password}'

During secret creation, the tooling transparently base64 encodes the values.
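You can verify this with the base64 tool; for example, the placeholder username user shown under Edit a curated secret encodes as:

echo -n 'user' | base64
# dXNlcg==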

Display a curated secret

Confirm the secret was successfully created in the namespace containing your cluster by running the following commands:

# -- show only presence of secret, or error if missing
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated 
# -- dump the sensitive contents of the secrets
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated --output yaml
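To check a single value without scanning the full YAML dump, you can decode one key directly; this is a standard kubectl jsonpath pattern, not Hydrolix-specific:

# -- decode a single value from the secret
kubectl -n ${HDX_KUBERNETES_NAMESPACE} get secrets curated \
    --output jsonpath='{.data.LOGS_HTTP_AUTH_USERNAME}' | base64 --decode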

Edit a curated secret

Edit the secret in place with:

kubectl -n ${HDX_KUBERNETES_NAMESPACE} edit secrets curated

This should open a file with contents similar to:

apiVersion: v1
items:
- apiVersion: v1
  data:
    LOGS_HTTP_AUTH_PASSWORD: c2VrcmV0
    LOGS_HTTP_AUTH_USERNAME: dXNlcg==
    LOGS_HTTP_AUTH_TOKEN: ZXlKaGJHY2lPaUpGWkVSVFFTSXNJblI1Y0NJNklrcFhWQ0o5LmV5SnBjM01pT2lKb2RIUndjem92TDJSdlkzTXRjMkZ1WkdKdmVDNW9lV1J5YjJ4cGVDNWtaWFl2WTI5dVptbG5JaXdpWVhWa0lqb2lZMjl1Wm1sbkxXRndhU0lzSW5OMVlpSTZJbVV5WkdNeFpERXhMVFprT1RBdE5HRmtOQzA1WldNMkxXRTJNbU00WVRZNE9XUTJOaUlzSW1saGRDSTZNVGMxTnpBd05Ea3hOeTQzT0RnM056VXNJbVY0Y0NJNk1UYzRPRFUwTURreE55NDNNVFE0TlRRc0ltcDBhU0k2SWpVaWZRLmFWN0x4RXhYUFNjV2p4NnAwM0tuMTZ2T1U5RDJtIC0gcGhGT2VJeHFZdTBrSXFxS1dOMVdMQ2poV1hiYmhrYVRpMTBEclhrQTM0bFFheFpsakR1blRiQ3c=
  kind: Secret
  metadata:
    creationTimestamp: "2024-11-22T18:34:25Z"
    name: curated
    namespace: {k8s_namespace}
    resourceVersion: "30930857"
    uid: dd0672bc-99b7-485b-a45d-0f78f3c0f6f1
  type: Opaque
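If you prefer not to hand-edit base64 values, a merge patch with stringData also works; Kubernetes encodes the value for you ({token} is a placeholder):

kubectl -n ${HDX_KUBERNETES_NAMESPACE} patch secret curated \
    --type merge \
    --patch '{"stringData": {"LOGS_HTTP_AUTH_TOKEN": "{token}"}}'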

The operator pod collects secrets from curated and merges them into the dynamically-generated general secret.

Then it constructs a configuration file for the Vector application and pod, defining an HTTP sink that includes the remote endpoint and credentials.
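Conceptually, the generated sink is ordinary Vector configuration, roughly like the sketch below. This is illustrative only, not the operator's actual output; the sink name, input name, and endpoint are assumptions.

# Illustrative Vector HTTP sink; names and endpoint are assumptions.
sinks:
  remote_logs:
    type: http
    inputs: ["kubernetes_logs"]
    uri: "https://company.hydrolix.live/ingest/event"
    auth:
      strategy: basic
      user: "${LOGS_HTTP_AUTH_USERNAME}"
      password: "${LOGS_HTTP_AUTH_PASSWORD}"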

Log level

The log_level setting designates custom log levels for the services in a Hydrolix cluster.

Specify a YAML dictionary with Hydrolix service names or the wildcard string “*” as keys. Service name keys take precedence over the default wildcard key.

The example below sets all services that respect this setting to the info level, except for stream-head and query-head, which are set to critical and trace, respectively:

log_level:
  "*": info
  stream-head: critical
  query-head: trace

Valid log level values:

  • critical
  • error
  • warning
  • info
  • debug
  • trace

Values are case-insensitive.

Below is a list of services that support the log_level setting. Services not on this list ignore this setting.

  • akamai-siem-indexer
  • akamai-siem-peer
  • alter
  • alter-head
  • alter-indexer
  • autoingest
  • batch-head
  • batch-indexer
  • batch-peer
  • decay
  • hdx-scaler
  • intake-api
  • intake-head
  • intake-indexer
  • job-purge
  • kafka-indexer
  • kafka-peer
  • kinesis-indexer
  • kinesis-peer
  • log-vacuum
  • merge
  • merge-cleanup
  • merge-controller
  • merge-head
  • merge-indexer
  • merge-peer
  • partition-vacuum
  • prune-locks
  • query-head
  • query-peer
  • reaper
  • rejects-vacuum
  • stale-job-monitor
  • stream-head
  • stream-indexer
  • stream-peer
  • summary_peer
  • summary-indexer
  • task-monitor
  • turbine-api
  • validator
  • validator-indexer