Run SQL queries
The hdxsql command runs a user-authored ClickHouse SQL statement against a configured Hydrolix cluster and streams the result rows back into the Splunk pipeline.
Availability
hdxsql is available in Hydrolix Search for Splunk version 1.5.0 and later.
Use hdxsql when hdxsearch's option-based shape doesn't fit the query: JOINs, CTEs (WITH), UNION/INTERSECT/EXCEPT, window functions, system tables, EXPLAIN, or ad-hoc administrative queries such as SELECT version().
Unlike hdxsearch, hdxsql doesn't compose SQL from option slots and doesn't add the Splunk time window. You write the complete statement, and the connector validates it, merges its own safety and telemetry SETTINGS, optionally caps the result size, and sends it.
hdxsql runs only read-only, SELECT-shaped queries and rejects write and administrative statements before they reach the cluster.
Syntax⚓︎
| hdxsql Syntax | |
|---|---|
Parameters⚓︎
| Parameter | Type | Required | Description |
|---|---|---|---|
query |
string | Yes | A complete ClickHouse SELECT-shaped statement: SELECT, WITH … SELECT, a UNION/INTERSECT/EXCEPT of SELECTs, or an EXPLAIN of any of those. |
cluster |
string | No | The name of the Hydrolix cluster to query. Defaults to the configured default cluster. |
params |
JSON | No | A JSON object mapping parameter names to values, bound to ClickHouse parameter placeholders ({name:Type}) in query. Requires Hydrolix 6.0 and later. See Bind parameters. |
maxrows |
integer | No | Overall row cap. Applied both as a hdx_query_max_result_rows setting and as a LIMIT clamp on the query. 0 (or omitted) means no cap. |
timeout |
integer | No | Query execution time cap in seconds, bound to hdx_query_max_execution_time. 0 (or omitted) means no connector-applied timeout, so the Hydrolix-side default applies. |
fetchsize |
integer | No | Server-side emit chunk size, bound to ClickHouse's max_block_size setting. Affects memory use and time-to-first-row. 0 (or omitted) leaves the ClickHouse default in place. |
comment |
string | No | A human-readable comment forwarded as hdx_query_comment on the Hydrolix query. Useful for attributing dashboard or alert traffic in Hydrolix query telemetry. |
Map results to Splunk fields⚓︎
hdxsql doesn't rename or remap result columns: each result column becomes a Splunk field with exactly the column name or alias from the SELECT. Splunk-recognized fields such as _time, host, source, sourcetype, _raw, and _indextime are populated only when you alias a column to that name.
For _time in particular, the value must be Unix-epoch seconds, typically produced with toUnixTimestamp(<column>) AS _time.
| Alias Columns to Splunk Fields | |
|---|---|
Without an alias, the result has a column-named field but no Splunk-recognized _time, and _raw is never populated by serializing the row.
Compose a time window⚓︎
hdxsql doesn't add the Splunk time picker's earliest/latest bounds. Compose the time window using one of two patterns.
SPL subsearch with addinfo⚓︎
Using addinfo to bind the time picker doesn't depend on query parameters, so this pattern works regardless of the Hydrolix version. The substitution happens at SPL's command-parsing layer, so by the time hdxsql runs, the query argument is a literal string.
addinfo exposes info_min_time and info_max_time as floating-point epoch seconds, so use toDateTime(...) (which accepts Float64) rather than fromUnixTimestamp(...) (which accepts only integers):
| Bind the Time Picker with addinfo | |
|---|---|
Parameter placeholders⚓︎
On Hydrolix version 6.0 and later, reference {earliest_time:Int64} / {latest_time:Int64} placeholders in the SQL and bind them through params. This is the preferred way to pass any user-supplied scalar value. See Bind parameters.
Bind parameters with params⚓︎
The params option accepts a JSON object that binds values to ClickHouse parameter placeholders ({name:Type}) referenced in the SQL. Values are bound at the request layer and are never interpolated into the SQL string, which keeps user-supplied values safe from injection.
| Parameter Binding | |
|---|---|
params requires Hydrolix 6.0 and later
On clusters reporting a server_version below 6.0.0, a query that supplies params exits with a user-visible error. Older clusters should use the addinfo SPL-subsearch pattern instead.
Malformed JSON, or JSON whose top-level value isn't an object, is rejected before any request is sent. Type mismatches between a value and its declared placeholder Type, and mismatches between supplied parameter names and the placeholders in the SQL, surface as Hydrolix-side errors.
Connector-controlled SETTINGS⚓︎
When hdxsql sends a query, it merges a connector-owned SETTINGS bundle into the SETTINGS clause, if you supplied one. Non-reserved keys in the clause (for example, use_query_cache) pass through unchanged:
| User SETTINGS Pass Through | |
|---|---|
The following keys are reserved. A user-supplied value for any of them is dropped before the connector's value is appended, regardless of whether the matching command option was supplied:
| Reserved key | Source |
|---|---|
hdx_query_admin_comment |
Splunk-side attribution (always set) |
readonly |
Connector-controlled read-only flag (always set) |
hdx_query_max_memory_usage |
Connector-fixed at 2 GiB (always set) |
hdx_query_max_attempts |
Connector-fixed at 1 (always set) |
hdx_query_max_execution_time |
timeout option (only when supplied) |
hdx_query_max_result_rows |
maxrows option (only when supplied) |
max_block_size |
fetchsize option (only when supplied) |
hdx_query_comment isn't reserved, because it carries user attribution rather than a safety guarantee. An inline SETTINGS hdx_query_comment value and the comment option each pass through, and when both are set they're joined into one entry (hdx_query_comment='X; Y').
Example queries⚓︎
Every query that scans a table must bound the primary timestamp column so Hydrolix can prune partitions. An unbounded query scans the entire table. These examples query the akamai.logs table and use a relative WHERE reqTimeSec >= now() - INTERVAL 1 HOUR bound for brevity; to bind a query to the Splunk time picker instead, use the addinfo or params patterns. The SELECT version() query is the exception. It reads cluster metadata and touches no table.
-
Report the percentage of requests served from cache and the average time to first byte for each request host.
Cache Offload per Host -
Build a per-minute request volume time series for charting by aliasing the time bucket to
_time. -
Rank the request paths that serve the most traffic in each host using a CTE and a window function.
-
Query a named cluster and cap the result at 1,000 rows.
maxrowsapplies both aLIMITclamp and thehdx_query_max_result_rowsserver-side cap. This requires a cluster named SecondCluster defined in the Hydrolix Search app. -
Check connectivity and report the cluster's Hydrolix version.
Cluster Version
See also⚓︎
- Query - the
hdxsearchcommand for SPL-shaped log exploration - Explore the data catalog - the
hdxdescribecommand for listing projects, tables, and column schemas