Hydrolix Search for Splunk
Overview
Use Hydrolix as a back-end datastore for your existing Splunk tables to take advantage of low-latency queries, long-term retention, and cost savings.
Hydrolix Search for Splunk can query raw data tables and summary tables for quick charting. It does this through a new hdxsearch command for Splunk SPL, which has the following features:
- With minimal configuration, it queries your Hydrolix clusters.
- It automatically finds the primary timestamp for the specified table.
- It applies time range filtering from the Splunk UI.
- It can limit query results to protect the Splunk UI.
For simplicity, the hdxsearch command accepts only a plain list of fields for the SELECT portion of a query. This limitation does not apply to WHERE clauses. If you need more flexibility in the SELECT portion of your queries, see our Splunk with DB Connect method, which gives you full ClickHouse SQL capabilities.
Installation
In the Splunk Enterprise UI, inside the Apps menu, select Find More Apps.

Type Hydrolix into the search box and select the Hydrolix Search application from the results on the right.

Click the Install button. The installation process may require logging in to Splunkbase with your Splunk username and password.
Configuration
Cluster Credentials
On installation, you will be directed to the Hydrolix Search setup form. This example of the form has two clusters configured.

Fill in the configuration fields:
Field Name | Description | Example |
---|---|---|
Cluster Name | The name used to refer to this cluster in your Splunk Search Processing Language (SPL) queries | Demo |
Host:Port | The hostname (and optional port number) of your Hydrolix cluster | mycluster.mydomain.com:8088 |
Username | The username you've chosen to query your Hydrolix cluster | sampleuser |
Password | The password for the above user | sdjf^wer%!k |
Default result count limit | The maximum number of records to retrieve per query (unless overridden by the query) | 5000 |
Splunk queries use port 8088 on Hydrolix clusters
Hydrolix Search for Splunk uses the ClickHouse HTTP query interface to the cluster on tcp/8088.
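As a rough illustration of what a request to the ClickHouse HTTP query interface looks like, the following Python sketch builds (but does not send) a query request using only the standard library. The https scheme, the root path, HTTP basic authentication, and the build_request helper name are assumptions made for this example, not details confirmed for the app. The host, username, and password are placeholders.

```python
import base64
import urllib.request


def build_request(host_port: str, username: str, password: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ClickHouse HTTP interface."""
    # Assumption: HTTPS and the root path; adjust both to match your cluster.
    url = f"https://{host_port}/"
    # Assumption: HTTP basic authentication with the configured credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),  # the SQL text travels in the request body
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


# Placeholder values taken from the configuration examples above.
req = build_request("mycluster.mydomain.com:8088", "sampleuser", "example-password", "SELECT 1")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. This can be a quick way to confirm that port 8088 is reachable from the Splunk host before configuring the app.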
Select a default cluster to run queries against by clicking the MAKE DEFAULT CLUSTER bubble on the right-hand side of the configuration line for that cluster.
Multiple Clusters
If you will be using more than one cluster or user account from this Splunk instance, add them to the list with the OR ADD CLUSTER option. Clicking the plus sign will open up a new row of configuration.
Save the Configuration
Once the configuration is done, select Save Changes to apply changes. Your new cluster settings will replace any previously-saved settings, and you will be automatically directed to the query screen of the Hydrolix Search for Splunk application.
Saving changes will overwrite all settings
Reconfigure an existing cluster
To reconfigure Hydrolix Search for Splunk, inside the Apps menu, select Manage Apps. Type Hydrolix into the search box and under Actions, click Set up. This will present the same cluster configuration screen as on initial installation, populated with the currently-configured clusters.
Query
The following are some example queries along with parameters and query settings which can be used to customize Splunk queries to Hydrolix.
Quickstart Sample Query
Here's an example of the query screen showing a query and results.

The following is a simple example. Replace my_project.my_table with the Hydrolix project and table of your own choosing:
| hdxsearch table="my_project.my_table" fields="*"
Even though the results of this query can be quite large, the time picker in the upper right-hand corner of the query interface and the cluster's default limits will act as guardrails to avoid returning too much data or using excessive compute resources.
Note that Hydrolix Search for Splunk does not support Splunk's real-time UI, so the time picker only provides relative options.
Query Parameters
In addition to the required table parameter and one of fields or raw, you can specify a WHERE clause, adjust the row limit, and change other settings as parameters to the hdxsearch command:
Parameter Name | Type | Required | Description |
---|---|---|---|
table | string (fieldname) | Yes | The Hydrolix table to query in the form project.table. |
fields | list of strings | Conditional | A comma-delimited list of fields to retrieve from the table, or *, which returns all the fields. Either fields or raw must be specified. |
raw | string (fieldname) | Conditional | The name of a field whose raw value should be sent to the "Event" column of the SPL query output. Either fields or raw must be specified. |
where | string | No | A SQL WHERE clause to filter the results of the query. Defaults to no filter. |
time | string (fieldname) | No | The name of a field in table to treat as the event timestamp. Defaults to the table's timestamp primary key. |
limit | integer | No | Maximum number of rows to retrieve from the table or 0 to retrieve all rows. Defaults to the limit value configured for the cluster being queried. |
cluster | string (fieldname) | No | The name of the Hydrolix cluster to query. Defaults to the configured default cluster. |
nocache | boolean | No | If set to true, query results will be excluded from caching. Defaults to false, which uses Hydrolix query caching. |
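To make the relationship between these parameters and the resulting SQL more concrete, here is a minimal Python sketch of how a query could be assembled from them. It is an illustration only, not the app's actual implementation: the build_sql function, the epoch-second time bounds, and the use of toDateTime for the time range filter are all assumptions, and the sketch does no input sanitizing.

```python
from typing import Optional


def build_sql(table: str, fields: str, time_field: str,
              earliest: int, latest: int,
              where: Optional[str] = None, limit: int = 5000) -> str:
    """Assemble a SELECT statement from hdxsearch-style parameters (illustrative only)."""
    # fields is either "*" or a comma-delimited list of column names.
    if fields.strip() == "*":
        columns = "*"
    else:
        columns = ", ".join(f.strip() for f in fields.split(",") if f.strip())
    # Assumption: the Splunk time picker range arrives as epoch seconds.
    conditions = [
        f"{time_field} >= toDateTime({earliest})",
        f"{time_field} < toDateTime({latest})",
    ]
    if where:
        conditions.append(f"({where})")  # the user-supplied filter is ANDed in
    sql = f"SELECT {columns} FROM {table} WHERE {' AND '.join(conditions)}"
    if limit > 0:  # limit=0 means "retrieve all rows", so no LIMIT is added
        sql += f" LIMIT {limit}"
    return sql


print(build_sql("my_project.my_table", "reqHost, reqMethod", "timestamp",
                1700000000, 1700003600, where="reqMethod='POST'", limit=0))
```

The sketch shows why WHERE clauses stay flexible while the SELECT portion stays simple: the where string is passed through as-is, whereas fields is only ever treated as a list of column names.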
Performance Note: Limiting Fields
Because Hydrolix is a columnar data store, the number of fields returned by the query should be limited to accelerate execution and reduce compute resources. Instead of using wildcards in the fields parameter, specify only the required columns.
Example queries
- Return all fields from my_project.my_table, limited by the Splunk UI's time picker and the default maximum of 5,000 rows.
| hdxsearch table="my_project.my_table" fields="*"
- Return the reqHost and reqMethod columns from my_project.my_table.
| hdxsearch table="my_project.my_table" fields="reqHost, reqMethod"
- Bypass the 5,000-row limit and return rows where the reqHost field is my.hostname.com and the reqMethod is POST. The contents of the where parameter are passed along to Hydrolix in an SQL WHERE clause.
| hdxsearch table="my_project.my_table" fields="reqHost, reqMethod" limit=0 where="reqHost IN ('my.hostname.com') AND reqMethod='POST'"
- Aggregates aren't supported by the simple SELECT statements available, so this example relies on Splunk's SPL to count the number of rows. Use limit=0 to request that all data be included in the aggregation.
| hdxsearch table="my_project.my_table" fields="reqHost, reqMethod" limit=0 | stats count by reqHost
- Output the raw value of the UA field into the Event column of the SPL query output.
| hdxsearch table="my_project.my_table" fields="reqHost, reqMethod, UA" raw="UA"
- Run a query on a named cluster. This requires a cluster named SecondCluster to be defined in the Hydrolix Search app.
| hdxsearch table="my_project.my_table" fields="reqHost, reqMethod" cluster="SecondCluster"
Default query settings
Query settings in Hydrolix clusters implement configurable limits to protect cluster resources and the applications receiving results.
Query options
Hydrolix Search for Splunk uses the following default values for circuit breaker query options and query caching with each query.
Query option | Value | Type |
---|---|---|
hdx_query_max_execution_time | 60 (seconds) | Circuit breaker |
hdx_query_max_attempts | 3 | Circuit breaker |
hdx_query_max_result_rows | 100000 | Circuit breaker |
use_query_cache | true | Performance |
You can use the SQL SETTINGS clause when executing queries to alter the default query options or use additional options.
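For readers unfamiliar with the shape of a SETTINGS clause, the Python sketch below renders one from the default values in the table above, with optional overrides. The with_settings and format_value helpers are hypothetical names used only for this illustration. The sketch shows the SQL text only and makes no claim about how the app attaches the clause internally.

```python
from typing import Any, Dict, Optional

# Default query options listed in the table above.
DEFAULT_OPTIONS: Dict[str, Any] = {
    "hdx_query_max_execution_time": 60,
    "hdx_query_max_attempts": 3,
    "hdx_query_max_result_rows": 100000,
    "use_query_cache": True,
}


def format_value(value: Any) -> str:
    """Render a Python value as SQL setting text (booleans become true/false)."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    return str(value)


def with_settings(sql: str, overrides: Optional[Dict[str, Any]] = None) -> str:
    """Append a SETTINGS clause that merges the defaults with any overrides."""
    options = {**DEFAULT_OPTIONS, **(overrides or {})}
    rendered = ", ".join(f"{name} = {format_value(v)}" for name, v in options.items())
    return f"{sql} SETTINGS {rendered}"


# Example: raise the execution time limit to 120 seconds for one query.
print(with_settings("SELECT reqHost FROM my_project.my_table",
                    {"hdx_query_max_execution_time": 120}))
```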
LIMIT clause
Hydrolix Search for Splunk attaches a LIMIT 5000 clause to each query by default, which limits a result set to 5000 rows. You can adjust this limit by setting the Default result count limit in the Splunk configuration, or per query with the limit parameter in the search console.
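The precedence between the two limits can be summarized in a few lines of Python. This is a sketch of the behavior as documented here, with a hypothetical effective_limit helper. Treating a cluster default of 0 as unlimited is an assumption, since only the per-query limit=0 case is described.

```python
from typing import Optional


def effective_limit(cluster_default: int = 5000,
                    query_limit: Optional[int] = None) -> Optional[int]:
    """Return the row limit to apply, or None when no LIMIT clause is needed."""
    # A limit given on the query overrides the cluster's configured default.
    chosen = cluster_default if query_limit is None else query_limit
    # 0 means "retrieve all rows" (assumed to hold for the cluster default too).
    return None if chosen == 0 else chosen


print(effective_limit())                  # cluster default applies
print(effective_limit(query_limit=100))   # per-query override
print(effective_limit(query_limit=0))     # no limit
```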
Troubleshooting
- "Invalid Argument" messages when making queries.
- This usually means that your table name or field name(s) don't exist. Double-check your table and field names.
v1.0.3
Added
- Added support for Hydrolix query caching. By default, each query runs with the setting use_query_cache = TRUE. To run a query without caching, use the option nocache=true.
- Added UI fields validation for HDX cluster and user credential inputs.
- Added support for Gzip HTTP response compression to reduce payload size.
- Added CI/CD pipeline for automated testing and deployment.
- Introduced automated app packaging validation using splunk-appinspect, integrated into CI for the Splunk App.
- Added an input on setup page to allow users to set default result count limit.
- Added support for making cluster queries via proxy.
- Timestamp primary key is now auto-detected for all supported tables.
Changed
- Improved performance: changed wire format for data-retrieving queries to JSONCompact. Based on FastFormats benchmarks, this results in a 3.09× reduction in serialized data size and a 1.65× decrease in ClickHouse CPU usage.
- Improved persistence of cluster configuration to retain settings across app lifecycle changes.
- Default time field resolution is now deterministic:
  - First, it selects a column with is_in_primary_key = 1 and type DateTime. (Note: is_in_primary_key may not always be set in ClickHouse.)
  - If none is found, it falls back to a column named timestamp with type DateTime.
  - If neither is found, an error is raised.
- Replaced DESCRIBE TABLE with a query to system.columns for retrieving table column information.
- Improved error handling and messages for invalid queries to provide clearer feedback and a more reliable user experience.
Fixed
- UI now displays pre-existing configuration for easier review and updates.
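The deterministic time field resolution described under Changed can be sketched in Python as follows. This is an illustration of the three documented steps, not the app's source code. The resolve_time_field function and the dictionary row shape (mirroring the name, type, and is_in_primary_key columns of system.columns) are assumptions, as is the choice to accept DateTime variants such as DateTime64.

```python
from typing import Dict, List


def resolve_time_field(columns: List[Dict]) -> str:
    """Pick the default time field from rows shaped like system.columns output."""
    def is_datetime(col: Dict) -> bool:
        # Assumption: DateTime variants such as DateTime64(3) also qualify.
        return str(col.get("type", "")).startswith("DateTime")

    # Step 1: a DateTime column flagged as part of the primary key.
    for col in columns:
        if col.get("is_in_primary_key") == 1 and is_datetime(col):
            return col["name"]
    # Step 2: fall back to a DateTime column literally named "timestamp".
    for col in columns:
        if col["name"] == "timestamp" and is_datetime(col):
            return col["name"]
    # Step 3: nothing usable, so raise an error.
    raise ValueError("no DateTime primary key or 'timestamp' column found")


# Hypothetical column metadata for demonstration.
rows = [
    {"name": "reqHost", "type": "String", "is_in_primary_key": 0},
    {"name": "event_time", "type": "DateTime", "is_in_primary_key": 1},
    {"name": "timestamp", "type": "DateTime", "is_in_primary_key": 0},
]
print(resolve_time_field(rows))
```

When the error in step 3 occurs, setting the time parameter explicitly on the hdxsearch command names the event timestamp field directly.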