Logging Configuration
Configure the log level and destination(s) for your Hydrolix logs.
Log Level
The log_level setting designates custom log levels for the services in a Hydrolix cluster. Specify a YAML dictionary with Hydrolix service names or the wildcard string "*" as keys. Service name keys take precedence over the default wildcard key.
The example below sets all services that respect this setting to the info level, except for stream-head and query-head, which are set to the critical and trace levels, respectively:
log_level:
"*": info
stream-head: critical
query-head: trace
Valid log level values:
critical
error
warning
info
debug
trace
Values are case-insensitive.
Below is a list of services that support the log_level setting; services not on this list ignore it. An example override using several of these services follows the list.
akamai-siem-indexer
akamai-siem-peer
alter
alter-head
alter-indexer
autoingest
batch-head
batch-indexer
batch-peer
decay
intake-api
intake-head
intake-indexer
job-purge
kafka-indexer
kafka-peer
kinesis-indexer
kinesis-peer
log-vacuum
merge
merge-cleanup
merge-controller
merge-head
merge-indexer
merge-peer
partition-vacuum
prune-locks
query-head
query-peer
reaper
rejects-vacuum
stale-job-monitor
stream-head
stream-indexer
stream-peer
summary_peer
summary-indexer
task-monitor
turbine-api
validator
validator-indexer
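For example, to troubleshoot streaming ingest you might raise only the intake-related services to a more verbose level while keeping everything else quiet. The sketch below uses service names from the list above; adjust it to whichever services you are investigating:
log_level:
  "*": warning
  intake-head: debug
  stream-head: debug
  stream-peer: debug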
Log Destinations
By default, logs generated by Hydrolix are stored in the cluster's default object store, in the hydro.logs table. Delivery is handled by a locally deployed Vector instance, which places log data on an internal Redpanda queue. Configuring how your Hydrolix logs are routed generally comes down to whether you want them stored only in the local cluster's object store (the default) or in both the local and a remote Hydrolix cluster's object store. The latter "dual sink" mode uses HTTPS for transport and is more configurable, allowing two outputs with different table and transform destinations.
The destination, method of delivery, and whether to use remote cluster authentication for Hydrolix logs are all configurable. There are two "sink" types for the Vector instance, which specify the method of log data delivery (kafka or http), and four possible configurations:
Destination(s) | Method of Delivery | Authentication |
---|---|---|
Local cluster object store | Routed through a local Redpanda queue via a kafka sink | no |
Local cluster object store | Routed through the local Intake-heads via an http sink | no |
Local cluster object store and remote cluster object store | Routed through local and external Intake-heads via http sinks. This option allows you to send a copy of your Hydrolix cluster's log data to another Hydrolix cluster. | no |
Local cluster object store and remote cluster object store | Routed through local and external Intake-heads via http sinks. This option allows you to send a copy of your Hydrolix cluster's log data to another Hydrolix cluster. | yes |
This means there are two options for delivering log data to the internal sink:
- A Redpanda stream (default)
- The cluster's local HTTP ingest endpoint:
http://stream-head:8089/ingest/event
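For instance, switching the internal sink from the Redpanda queue to the local ingest endpoint comes down to the settings shown in this minimal sketch; the logs_sink_local_url value is the default local endpoint shown above, and the full set of related settings is covered under Configuration below:
spec:
  logs_sink_type: "http"
  logs_sink_local_url: "http://stream-head:8089/ingest/event"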
For the external sink, the only method of delivery is http. However, there are two distinct ways to specify your destination:
Destination | Description |
---|---|
https://{remote_hdx_host}/ingest/event | Send to the default ingest pools of a remote cluster |
https://{remote_hdx_host}/pool/{custom-ingest-pool}/ingest/event | Send to a custom ingest pool of a remote cluster. You can read more about creating custom ingest pools in our Resource Pools documentation. |
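For example, a minimal sketch of a remote sink pointed at a custom ingest pool might look like the following. The pool name logs-pool is a hypothetical placeholder, and the table and transform values mirror the configuration reference below:
spec:
  logs_sink_type: "http"
  logs_sink_remote_url: "https://company.hydrolix.live/pool/logs-pool/ingest/event"
  logs_http_remote_table: "hydro.logs"
  logs_http_remote_transform: "megaTransform"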
Incorporating an external sink requires both sinks to be http
If you specify logging to an external sink via http, the internal sink must also be configured to be http. You cannot send your Hydrolix log data to both the internal Redpanda queue and a remote HTTP sink.
Configuration
You can modify the Hydrolix logs destination and related configuration in your hydrolixcluster.yaml:
spec:
logs_sink_type: "kafka" # this places the log data on the local cluster's redpanda queue, not a kafka queue
logs_sink_local_url: "http://stream-head:8089/ingest/event"
logs_sink_remote_url: ""
logs_sink_remote_auth_enabled: false
logs_http_remote_table: "hydro.logs"
logs_http_remote_transform: "megaTransform"
logs_http_table: "hydro.logs"
logs_http_transform: "megaTransform"
Each setting is described below:
spec:
logs_sink_type: # The type of log data sink. Valid options are kafka or http.
logs_sink_local_url: # The full URI to send local HTTP requests containing log data.
logs_sink_remote_url: # The full URI to send remote HTTP requests containing log data.
logs_sink_remote_auth_enabled: # When enabled, remote HTTP will use basic authentication from a curated secret. Note that enabling this option requires providing basic auth via the environment variables LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD.
logs_http_remote_table: # An existing Hydrolix <project.table> where the log data should be stored within the remote cluster object store.
logs_http_remote_transform: # The transform schema to use for log data ingested into the remote cluster.
logs_http_table: # An existing Hydrolix <project.table> where the log data should be stored in the local cluster object store.
logs_http_transform: # The transform schema to use for log data ingested into the local cluster.
The expected type of each setting:
spec:
logs_sink_type: string
logs_sink_local_url: string
logs_sink_remote_url: string
logs_sink_remote_auth_enabled: boolean
logs_http_remote_table: string
logs_http_remote_transform: string
logs_http_table: string
logs_http_transform: string
Example Configurations
The following are example configurations for each of the four configuration options.
Local Redpanda (default)
Because this is the default state of every Hydrolix deployment, there is no additional configuration needed for Hydrolix logs to be routed via a Redpanda queue to the cluster's default object store.
Local HTTP
spec:
logs_sink_type: "http"
logs_http_table: "team_project.cluster_logs"
logs_http_transform: "custom_transform"
Local HTTP and Remote Cluster over HTTP without Authentication
spec:
logs_sink_type: "http"
logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
logs_http_remote_transform: "multi_cluster_log_transform"
logs_http_table: "team_project.cluster_logs"
logs_http_transform: "custom_transform"
Local HTTP and Remote Cluster over HTTP with Authentication
spec:
logs_sink_type: "http"
logs_sink_remote_url: "https://company.hydrolix.live/ingest/event"
logs_sink_remote_auth_enabled: true
logs_http_remote_table: "multi_cluster_monitoring.shared_logs"
logs_http_remote_transform: "multi_cluster_log_transform"
logs_http_table: "team_project.cluster_logs"
logs_http_transform: "custom_transform"
Basic Authentication
If you are sending your logs to an external sink and the remote cluster requires authentication, you must provide credentials. Do so by enabling logs_sink_remote_auth_enabled and supplying basic authentication credentials (username and password) through the following environment variables:
LOGS_HTTP_AUTH_USERNAME
LOGS_HTTP_AUTH_PASSWORD
You can store these values in a Kubernetes Secret and manage them with kubectl. Create a secret called curated using the following command, filling in the appropriate username and password for accessing the remote cluster:
kubectl create secret generic curated \
--from-literal=LOGS_HTTP_AUTH_USERNAME='{username}@{domain}.{tld}' \
--from-literal=LOGS_HTTP_AUTH_PASSWORD='{password}'
You can then confirm the secret was successfully created in the namespace containing your cluster by running the following command:
kubectl get secrets --namespace ${HDX_KUBERNETES_NAMESPACE} -o yaml
which should produce a result similar to:
apiVersion: v1
items:
- apiVersion: v1
data:
LOGS_HTTP_AUTH_PASSWORD: c2VrcmV0
LOGS_HTTP_AUTH_USERNAME: dXNlcg==
kind: Secret
metadata:
creationTimestamp: "2024-11-22T18:34:25Z"
name: curated
namespace: {k8s_namespace}
resourceVersion: "30930857"
uid: dd0672bc-99b7-485b-a45d-0f78f3c0f6f1
type: Opaque
The local cluster in which you created your curated secret should now be able to access the remote cluster. The operator pod aggregates all the values from the curated secret, merging them into the dynamically generated general secret. This general secret is loaded by the Vector pod, giving it access to the remote cluster's credentials.
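To verify that the credentials were merged, one option is to inspect the generated general secret with kubectl. This is a sketch that assumes the merged secret keeps the same key names as the curated secret:
kubectl get secret general --namespace ${HDX_KUBERNETES_NAMESPACE} \
  -o jsonpath='{.data.LOGS_HTTP_AUTH_USERNAME}' | base64 --decode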