Fluent Bit Integration
Fluent Bit is an open source telemetry agent that collects, processes, and forwards metric, log, and trace data from a wide range of environments into Hydrolix. Fluent Bit integrates with existing ecosystems such as Prometheus and OpenTelemetry and is designed for minimal performance impact. You can use it as a simple edge agent for a single deployment, or within a more complex environment as a central collector and decorator for telemetry data from varying sources and environments.
For more details, see the Fluent Bit documentation.
Before you begin
You will need a running Hydrolix deployment. If you have yet to deploy Hydrolix, follow the instructions for your preferred cloud vendor. From your running Hydrolix cluster, you will need the following information:
| Item | Description | Example value | How to obtain this information |
|---|---|---|---|
| Org ID | The ID of your Hydrolix organization. | ID: bc3f041b-ea52-45e1-b9d5-41819c465854 | You can list the orgs within your running Hydrolix cluster using the Hydrolix cluster API (see the example request below). The org ID you use should correspond to the table in which you want to store Fluent Bit data. |
| Project name and ID | A logical namespace for your table below. You will need the name of the project corresponding to the table in which you want to store Fluent Bit data. | Name: fluentbit_project ID: c2445da3-ec63-42be-9f12-f104f9656f4c | Follow these instructions to create a project. |
| Table name and ID | The destination to which you will route data from your Fluent Bit instance. You will need the name of the table you want to store Fluent Bit data in. | Name: fluentbit_table ID: 798dfcf4-7560-4b24-1337-29017619baa6 | Follow these instructions to create a table. |
| OAuth bearer token | Used by Fluent Bit to authenticate with the Hydrolix Streaming Ingest API. | eyXrxkzoN2fRiiKpnV... | Follow these instructions to generate an OAuth bearer token. |
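If you need to look up the org ID from the command line, one option is to query the Hydrolix Config API directly. This is a minimal sketch, assuming your cluster hostname and an OAuth bearer token stored in an HDX_TOKEN environment variable (both hypothetical placeholders you should substitute with your own values):

# List the orgs visible to your user; the uuid field of each entry is the org ID
curl -s "https://{hdx-host}.hydrolix.live/config/v1/orgs/" \
  -H "Authorization: Bearer $HDX_TOKEN"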
Getting started
This document covers four major steps. Following them will result in an integrated setup between Fluent Bit and a running Hydrolix cluster:
- Deploy Fluent Bit
- Create a Hydrolix Transform for your Fluent Bit data
- Configure Fluent Bit to send data to Hydrolix
- Verify that data is flowing through Fluent Bit into your Hydrolix cluster
Example files
Throughout this document, there are example files for Fluent Bit configuration and a Hydrolix transform. If you are standing up a proof-of-concept with Fluent Bit and Hydrolix, you can use these files to generate and send data from Fluent Bit to Hydrolix. The example files use the following Fluent Bit inputs, filters, and outputs:
Inputs
- CPU metrics (cpu)
- Memory usage (mem)
- Network interface traffic (netif)
- Disk I/O (disk)
Filters
- Nest (nest)
The Nest filter nests the input records within a map key. This ensures Hydrolix stores each input record type within its own distinct map column in a single table.
Outputs
- Standard output (stdout)
- HTTP (http)
Fluent Bit must send data over HTTP to the Hydrolix Streaming Ingest API endpoint for your Hydrolix cluster to consume it.
Deploy Fluent Bit
There are multiple deployment methods for Fluent Bit described in Getting Started with Fluent Bit.
If you are standing up a first-time or proof-of-concept deployment of Fluent Bit and Hydrolix, the Docker deployment is the quickest way to get started. It's also recommended to deploy the latest debug Docker image: the standard Fluent Bit images are distroless and don't include a shell or package manager, while the debug images include both. You can deploy the latest debug image with:
docker pull cr.fluentbit.io/fluent/fluent-bit:latest-debug
followed by
docker run -ti cr.fluentbit.io/fluent/fluent-bit:latest-debug
Use the Fluent Bit release images for production deployments.
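For a proof of concept, you'll typically want the container to pick up your own configuration. A minimal sketch, assuming a fluent-bit.conf in your current directory (the mount destination is the image's default configuration path):

# Mount a local configuration file over the image default and run interactively
docker run -ti \
  -v "$(pwd)/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf" \
  cr.fluentbit.io/fluent/fluent-bit:latest-debug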
Create a Hydrolix transform for the Fluent Bit data
You will need to create a Hydrolix transform, which determines how your Fluent Bit data is mapped onto your Hydrolix table. After creating it, you will specify the transform name within the Fluent Bit configuration file.
Reference these instructions for creating and publishing a transform. The following example transform maps incoming Fluent Bit data from the example inputs. The transform also uses the Hydrolix auto values feature to generate an ingest timestamp (hdx_ingest_timestamp). This timestamp acts as the primary column for the table in which Fluent Bit data will be ingested.
A more useful primary timestamp might be an event timestamp decorated onto the telemetry data by Fluent Bit before it's forwarded to your Hydrolix cluster. Additionally, you could decorate the outgoing Fluent Bit data with the originating hostname. These two use cases aren't covered within this document.
Creating transforms can be easier in the UI than via the API
If you create your transform in your Hydrolix cluster UI, you don't need to know the org, project, or table IDs. However, you will still need to supply the transform output columns.
[
  {
    "name": "fluentbit_transform",
    "description": "",
    "settings": {
      "is_default": true,
      "rate_limit": null,
      "sql_transform": null,
      "null_values": [],
      "sample_data": null,
      "output_columns": [
        {
          "name": "hdx_ingest_timestamp",
          "datatype": {
            "type": "datetime",
            "index": false,
            "primary": true,
            "format": "2006-01-02T15:04:05.999999Z",
            "resolution": "seconds",
            "default": null,
            "script": null,
            "source": {
              "from_automatic_value": "current_time"
            },
            "suppress": false
          }
        },
        {
          "name": "account_id",
          "datatype": {
            "type": "string",
            "index": true,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "source": null,
            "suppress": false
          }
        },
        {
          "name": "mem_stats",
          "datatype": {
            "type": "map",
            "index": true,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "elements": [
              {
                "type": "string",
                "index": true
              },
              {
                "type": "uint32",
                "index": true
              }
            ],
            "source": null,
            "suppress": false
          }
        },
        {
          "name": "swap_stats",
          "datatype": {
            "type": "map",
            "index": true,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "elements": [
              {
                "type": "string",
                "index": true
              },
              {
                "type": "uint32",
                "index": true
              }
            ],
            "source": null,
            "suppress": false
          }
        },
        {
          "name": "cpu_stats",
          "datatype": {
            "type": "map",
            "index": false,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "elements": [
              {
                "type": "string",
                "index": true
              },
              {
                "type": "double",
                "index": false
              }
            ],
            "source": null,
            "suppress": false
          }
        },
        {
          "name": "net_stats",
          "datatype": {
            "type": "map",
            "index": false,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "elements": [
              {
                "type": "string",
                "index": true
              },
              {
                "type": "double",
                "index": false
              }
            ],
            "source": null,
            "suppress": false
          }
        },
        {
          "name": "disk_stats",
          "datatype": {
            "type": "map",
            "index": false,
            "format": null,
            "resolution": "seconds",
            "default": null,
            "script": null,
            "elements": [
              {
                "type": "string",
                "index": true
              },
              {
                "type": "double",
                "index": false
              }
            ],
            "source": null,
            "suppress": false
          }
        }
      ],
      "compression": "",
      "wurfl": null,
      "format_details": {
        "flattening": {
          "depth": null,
          "active": false,
          "map_flattening_strategy": {
            "left": "",
            "right": ""
          },
          "slice_flattening_strategy": {
            "left": "",
            "right": ""
          }
        }
      }
    },
    "url": "https://{hdx-host}.hydrolix.live/config/v1/orgs/bc3f041b-ea52-45e1-b9d5-41819c465854/projects/c2445da3-ec63-42be-9f12-f104f9656f4c/tables/798dfcf4-7560-4b24-1337-29017619baa6/transforms/",
    "type": "json"
  }
]
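If you prefer to publish the transform via the API rather than the UI, you can POST it to the transforms endpoint shown in the url field above. This is a hedged sketch, assuming the transform object (not the surrounding array) is saved as fluentbit_transform.json, a hypothetical file name, and your token is in HDX_TOKEN:

# POST a single transform object to the table's transforms endpoint
curl -s -X POST \
  "https://{hdx-host}.hydrolix.live/config/v1/orgs/{org-id}/projects/{project-id}/tables/{table-id}/transforms/" \
  -H "Authorization: Bearer $HDX_TOKEN" \
  -H "Content-Type: application/json" \
  -d @fluentbit_transform.json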
Configure Fluent Bit to send data to Hydrolix
Fluent Bit supports two formats for configuration files: YAML (fluent-bit.yaml) and classic mode (fluent-bit.conf). You can read more about the two formats in Configuring Fluent Bit. The default configuration format and location in the Docker images is classic mode at /fluent-bit/etc/fluent-bit.conf. Example configuration in both formats appears below. The examples use the aforementioned Fluent Bit inputs, filters, and outputs and have been tested with Fluent Bit version 3.2.7.
Fluent Bit Configuration Changes Require a Restart
Don't forget to restart Fluent Bit after making changes to the configuration file; Fluent Bit doesn't reload configuration changes dynamically.
Classic mode (fluent-bit.conf):

[INPUT]
    name cpu
    tag cpu
    interval_sec 1

[INPUT]
    name mem
    tag mem
    interval_sec 1

[INPUT]
    name netif
    tag netif
    interval_sec 1
    interface ens5

[INPUT]
    name disk
    tag disk
    interval_sec 1

[FILTER]
    Name nest
    Match mem
    Operation nest
    Wildcard Mem.*
    Nest_under mem_stats
    Remove_prefix Mem.

[FILTER]
    Name nest
    Match mem
    Operation nest
    Wildcard Swap.*
    Nest_under swap_stats
    Remove_prefix Swap.

[FILTER]
    Name nest
    Match cpu
    Operation nest
    Wildcard *
    Nest_under cpu_stats

[FILTER]
    Name nest
    Match netif
    Operation nest
    Wildcard *
    Nest_under net_stats

[FILTER]
    Name nest
    Match disk
    Operation nest
    Wildcard *
    Nest_under disk_stats

[OUTPUT]
    name stdout
    match *

[OUTPUT]
    name http
    match *
    host {hdx-host}.hydrolix.live
    port 443
    uri /ingest/event
    tls on
    header x-hdx-table fluentbit_project.fluentbit_table
    header x-hdx-transform fluentbit_transform
    header Authorization Bearer eyJhbGciOiJSUzI1N...
YAML (fluent-bit.yaml):

service:
  http_server: "on"
  health_check: "on"
  log_level: info

pipeline:
  inputs:
    - name: cpu
      tag: cpu
      interval_sec: 1
    - name: mem
      tag: mem
      interval_sec: 1
    - name: disk
      tag: disk
      interval_sec: 1
    - name: netif
      tag: netif
      interval_sec: 1
      interface: ens5
  filters:
    - name: nest
      match: mem
      operation: nest
      wildcard: Mem.*
      nest_under: mem_stats
      remove_prefix: Mem.
    - name: nest
      match: mem
      operation: nest
      wildcard: Swap.*
      nest_under: swap_stats
      remove_prefix: Swap.
    - name: nest
      match: cpu
      operation: nest
      wildcard: '*'
      nest_under: cpu_stats
    - name: nest
      match: netif
      operation: nest
      wildcard: '*'
      nest_under: net_stats
    - name: nest
      match: disk
      operation: nest
      wildcard: '*'
      nest_under: disk_stats
  outputs:
    - name: stdout
      match: '*'
    - name: http
      match: '*'
      host: '{hdx-host}.hydrolix.live'
      port: 443
      uri: /ingest/event
      tls: on
      header: x-hdx-table fluentbit_project.fluentbit_table
      header: x-hdx-transform fluentbit_transform
      header: Authorization Bearer eyJhbGciOiJSUzI1NiIsIn...
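Before restarting Fluent Bit with a new configuration, it can help to validate the file first. A minimal sketch using Fluent Bit's dry-run flag, which parses the configuration and exits without starting the pipeline:

# Validate the configuration without starting Fluent Bit
fluent-bit -c /fluent-bit/etc/fluent-bit.conf --dry-run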
Verify data from Fluent Bit is in your Hydrolix cluster
If your Fluent Bit deployment is successfully sending data to your Hydrolix cluster, you should see lines similar to the following in your Fluent Bit logs:
2025-02-25 11:53:25 [2025/02/25 19:53:25] [ info] [output:http:http.1] {hdx-host}.hydrolix.live:443, HTTP status=200
2025-02-25 11:53:25 {"code":200,"message":"success"}
Once Fluent Bit data is being successfully stored in your Hydrolix cluster, you can query it using the UI, the API, or the other query interfaces described in Query Data. You can start with a query like the following to view your Fluent Bit data:
SELECT *
FROM fluentbit_project.fluentbit_table
LIMIT 10
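You can also verify over HTTP instead of the UI by issuing the query through the Hydrolix query endpoint. This is a hedged sketch, assuming a /query endpoint on your cluster hostname and a token in HDX_TOKEN; adjust for your cluster's query interface:

# Run a quick row count; the SQL is passed URL-encoded in the query parameter
curl -s "https://{hdx-host}.hydrolix.live/query?query=SELECT%20count()%20FROM%20fluentbit_project.fluentbit_table" \
  -H "Authorization: Bearer $HDX_TOKEN"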
Troubleshooting
Debug logging: Fluent Bit
For more log detail from Fluent Bit, you can enable debug logging with the following configuration.

Classic mode:

[SERVICE]
    log_level debug

YAML:

service:
  log_level: debug
Debug logging: Hydrolix
Within your hydrolixcluster.yaml Kubernetes configuration, change the following value to enable debug logging for all Hydrolix service components:

spec:
  log_level:
    '*': debug
You can read more about Hydrolix logging configuration here.
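If your cluster is managed by the Hydrolix Kubernetes operator, one way to apply this change is to edit the HydrolixCluster resource in place. A hedged sketch, assuming a kubectl context pointed at your cluster and a hypothetical namespace name:

# Open the HydrolixCluster spec for editing and set spec.log_level as shown above
kubectl edit hydrolixcluster -n my-hydrolix-namespace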
Querying Hydrolix cluster returns zero results
If querying your Hydrolix cluster for Fluent Bit data returns zero results, try the following investigative and troubleshooting steps:
High ingest latency
Within your Hydrolix cluster UI, navigate to the Data tab in the left-hand navigation bar, then select your project and table name.

[Screenshot: the Data tab showing an unusually high Ingest Latency and a Total Size of 0 rows]

A high Ingest Latency together with a Total Size of 0 rows indicates that no data has made it into the table. An unusually high ingest latency means data isn't making it from Fluent Bit into your Hydrolix cluster. The following troubleshooting steps will help identify the causes.
Authentication error
If you observe the following in your Fluent Bit logs:
2025-02-25 11:41:12 [2025/02/25 19:41:12] [error] [output:http:http.1] {hdx-host}.hydrolix.live:443, HTTP status=401
2025-02-25 11:41:12 401: Unauthorized
2025-02-25 11:41:12 [2025/02/25 19:41:12] [ warn] [output:http:http.1] could not flush records to {hdx-host}.hydrolix.live:443 (http_do=0), chunk will not be retried
this indicates an authentication error with your Hydrolix cluster. The Hydrolix HTTP streaming ingest API uses an OAuth Bearer token for authentication. You can retrieve a bearer token using these instructions.
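You can fetch a fresh token from the Config API login endpoint. This is a hedged sketch with hypothetical credentials; the bearer token is returned in the response body:

# Exchange a username and password for an OAuth bearer token
curl -s -X POST "https://{hdx-host}.hydrolix.live/config/v1/login" \
  -H "Content-Type: application/json" \
  -d '{"username": "user@example.com", "password": "your-password"}'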
Missing project, table, or transform
If you observe any of the following in your Fluent Bit logs:
2025-02-25 11:48:03 [2025/02/25 19:48:03] [error] [output:http:http.1] {hdx-host}.hydrolix.live:443, HTTP status=400
2025-02-25 11:48:03 {"code":400,"message":"no project 'nonexistent_project' found"}
2025-02-25 11:48:03 [2025/02/25 19:48:03] [ warn] [output:http:http.1] could not flush records to {hdx-host}.hydrolix.live:443 (http_do=0), chunk will not be retried
2025-02-25 11:47:23 [2025/02/25 19:47:23] [error] [output:http:http.1] {hdx-host}.hydrolix.live:443, HTTP status=400
2025-02-25 11:47:23 {"code":400,"message":"no table 'nonexistent_table' found"}
2025-02-25 11:47:23 [2025/02/25 19:47:23] [ warn] [output:http:http.1] could not flush records to {hdx-host}.hydrolix.live:443 (http_do=0), chunk will not be retried
2025-02-25 11:45:15 [2025/02/25 19:45:15] [error] [output:http:http.1] {hdx-host}.hydrolix.live:443, HTTP status=400
2025-02-25 11:45:15 {"code":400,"message":"unknown transform 'nonexistent_transform'"}
2025-02-25 11:45:15 [2025/02/25 19:45:15] [ warn] [output:http:http.1] could not flush records to {hdx-host}.hydrolix.live:443 (http_do=0), chunk will not be retried
this indicates that the project, table, or transform specified in your Fluent Bit configuration doesn't exist. Make sure your Fluent Bit configuration references an existing project and table in your Hydrolix cluster. Additionally, verify that you have published a transform to your Hydrolix cluster that's compatible with your Fluent Bit data.
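To confirm what actually exists on the cluster side, you can list the transforms registered for your table via the Config API. A hedged sketch using the same endpoint shape as the url field in the example transform, with the org, project, and table IDs from the table at the top of this document:

# List all transforms published for the table; check that fluentbit_transform appears
curl -s "https://{hdx-host}.hydrolix.live/config/v1/orgs/{org-id}/projects/{project-id}/tables/{table-id}/transforms/" \
  -H "Authorization: Bearer $HDX_TOKEN"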
Invalid transform SQL
If you use K9s or kubectl to access your running Hydrolix cluster, navigate to one of your intake-head pods (e.g., intake-head-5cf9dk247-idkok) and view the logs for the turbine container. If you observe a log line similar to the following:
2025-02-25T20:22:40.573208471Z {"timestamp": "2025-02-25T20:22:40.572+00:00", "component": "query_executor", "level":"error", "message":"{\"bytes_read\":0,\"dict_used\":[],\"exception\":\"Code: 47. DB::Exception: Missing columns: 'some_int32' 'some_other_int32' 'primary' while processing query: 'SELECT primary, 10 * some_int32 AS some_int32, some_other_int32 FROM file('7aaaaec2-ee43-43db-9541-950e5195961c/3a1bc171-ac68-47f2-bfb2-4ce193c654a7/42bc986dc5eec4d3/input.4291266653.json', 'JSONCompactEachRow', '`hdx_ingest_timestamp` DateTime,`account_id` Nullable(String),`mem_stats` Map(String,Nullable(UInt32)),`swap_stats` Map(String,Nullable(UInt32)),`cpu_stats` Map(String,Nullable(Float64)),`net_stats` Map(String,Nullable(Float64)),`disk_stats` Map(String,Nullable(Float64))')', required columns: 'primary' 'some_other_int32' 'some_int32'. (UNKNOWN_IDENTIFIER) (version 23.8.10.1)\",\"exception_code\":47...
Or a line like the following in the intake-head container within the same pod:
2025-02-25T20:22:20.574400897Z {"error":"sink error: Code: 47. DB::Exception: Missing columns: 'some_int32' 'some_other_int32' 'primary' while processing query: 'SELECT primary, 10 * some_int32 AS some_int32, some_other_int32 FROM file('7aaaaec2-ee43-43db-9541-950e5195961c/3a1bc171-ac68-47f2-bfb2-4ce193c654a7/42bc986dc5eec4d3/input.650434658.json', 'JSONCompactEachRow', '`hdx_ingest_timestamp` DateTime,`account_id` Nullable(String),`mem_stats` Map(String,Nullable(UInt32)),`swap_stats` Map(String,Nullable(UInt32)),`cpu_stats` Map(String,Nullable(Float64)),`net_stats` Map(String,Nullable(Float64)),`disk_stats` Map(String,Nullable(Float64))')', required columns: 'primary' 'some_other_int32' 'some_int32'. (UNKNOWN_IDENTIFIER)","file":"hdx_sink.go:447","level":"error","message":"Got error","timestamp":"2025-02-25T20:22:20.574+00:00"}
Both of these indicate a problem with the Transform SQL configured in the Hydrolix transform. Verify that the Transform SQL references existing columns, or try removing it. Once the Transform SQL is either valid or removed, you should no longer see these error messages.
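If you'd rather not browse pods interactively, the same logs are reachable with plain kubectl. A minimal sketch, reusing the example pod name from above and a hypothetical namespace:

# Tail the turbine container logs on an intake-head pod
kubectl logs -n my-hydrolix-namespace intake-head-5cf9dk247-idkok -c turbine --tail=100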