via Splunk with Hydrolix Search
Background
Use Hydrolix as a back-end datastore for your existing Splunk tables to take advantage of low-latency queries, long-term retention, and cost savings.
Hydrolix Search for Splunk can query raw data tables and summary tables for quick charting. It does this via a new `hdxsearch` command for Splunk SPL, which has the following features:
- With minimal configuration, it queries your Hydrolix clusters.
- It automatically finds the primary timestamp for the specified table.
- It applies time range filtering from the Splunk UI.
- It can limit query results to protect the Splunk UI.
To keep things simple, the `hdxsearch` command accepts only a plain list of fields for the SELECT portion of a query. This limitation does not apply to WHERE clauses. If you need more flexibility in the SELECT portion of your queries, see our Splunk with DB Connect method, which gives you full ClickHouse SQL capabilities.
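As a rough sketch of that split, the query below keeps `fields` to a plain column list and puts the filtering logic in the `where` parameter (described under Query Parameters). The table name `my.table` and the columns `reqHost` and `reqMethod` are placeholders. It also assumes the `where` text is evaluated by Hydrolix as ordinary ClickHouse SQL, so that operators such as `IN` and functions such as `lower()` are usable there.

```spl
| hdxsearch table="my.table" fields="reqHost, reqMethod" where="reqMethod IN ('GET', 'POST') AND lower(reqHost) = 'my.hostname.com'"
```

An expression such as `lower(reqHost)` would not be accepted in `fields`, which takes column names only.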
Installation
In your Splunk Enterprise instance UI, inside the “Apps” menu, select “Find More Apps.”
Type “Hydrolix” into the search box, then find the Hydrolix Search application in the results on the right.
Select the "Install" button. The installation process may require you to submit your Splunk username and password.
Configuration
Cluster Credentials
Next, you’ll be directed to the Hydrolix Search setup form. This example of the form has two clusters configured.
Fill in the four fields for each cluster:
Field Name | Description | Example |
---|---|---|
Cluster Name | The name you’ll use to refer to this cluster in your Splunk Search Processing Language (SPL) queries | Demo |
Host:Port | The hostname (and optional port number) of your Hydrolix cluster | mycluster.mydomain.com |
Username | The username you’ve chosen to query your Hydrolix cluster | sampleuser |
Password | The password for the above user | sdjf^wer%!k |
To use this cluster by default with the `hdxsearch` command, select “MAKE DEFAULT CLUSTER” on the right-hand side of the configuration line.
Multiple Clusters
If you plan to use more than one cluster or user account from this Splunk instance, add them to the list with the “OR ADD CLUSTER” option. Clicking the plus sign opens a new row of configuration.
Save the Configuration
Once you’re done, select “Save Changes” and you’ll be automatically directed to the query screen of the Hydrolix Search for Splunk application.
Saving changes will overwrite all settings
When you save changes, these settings will replace all the clusters you have defined in this application's configuration.
Query
Quickstart Sample Query
Here's an example of the query screen prefilled with a query and results.
Here’s a simple example. Replace `my.table` with a Hydrolix table of your choosing:
| hdxsearch table="my.table" fields="*"
Even though the results of this query can be quite large, the time picker in the upper right-hand corner of the query interface and the built-in limit of 5,000 query results will act as guardrails to avoid returning too much data or using excessive compute resources.
Note that Hydrolix Search for Splunk doesn’t support Splunk’s “real-time” UI, so the time picker only provides “relative” options.
Query Parameters
As well as the required `table` and `fields` parameters, you can specify a WHERE clause, adjust the row limit, and adjust other settings as parameters to the `hdxsearch` command:
Parameter Name | Required? | Description |
---|---|---|
table | Yes | The Hydrolix table you want to query |
fields | Yes | A comma-delimited list of the fields you want to retrieve from the table, or `*`, which returns all the fields |
cluster | No | The name of the Hydrolix cluster that contains the table you want to query |
where | No | An SQL WHERE clause to filter the results of your query |
limit | No | Overrides the maximum number of rows retrieved from the table. The default is 5,000. If you specify 0, the query attempts to return all available rows. |
raw | No | Specify the name of a field where you'd like the raw value to be sent to the “Event” column of the SPL query output |
Limit your fields
Since Hydrolix is a columnar data store, limiting the number of fields returned by the query is an effective way to speed up queries and reduce compute resources. Rather than using wildcards in your `fields` parameter, retrieve only the columns you need.
Example queries
| hdxsearch table="my.table" fields="*"
- Basic query returning all fields from `my.table`, limited by the Splunk UI’s time picker and the default 5,000 maximum row limit.
| hdxsearch table="my.table" fields="reqHost, reqMethod"
- Returns just the `reqHost` and `reqMethod` columns from `my.table`.
| hdxsearch table="my.table" fields="reqHost, reqMethod" limit=0 where="reqHost IN ('my.hostname.com') AND reqMethod='POST'"
- The same query as above, but bypasses the 5,000-row limit and only returns rows where the `reqHost` field is `my.hostname.com` and the `reqMethod` is POST. The contents of the `where` parameter are passed along to Hydrolix in an SQL WHERE clause.
| hdxsearch table="my.table" fields="reqHost, reqMethod" limit=0
| stats count by reqHost
- Aggregates aren’t supported by the simple SELECT statements available, so we depend on Splunk’s SPL to count the number of rows. Make sure you have `limit` set to something high enough to capture all of your data so it can be aggregated.
| hdxsearch table="my.table" fields="reqHost, reqMethod, UA" raw="UA"
- This query will output the raw value of the `UA` field into the “Event” column of the SPL query output.
| hdxsearch table="my.table" fields="reqHost, reqMethod" cluster="SecondCluster"
- This query requires that you have a cluster set up with the name `SecondCluster`. Rather than querying the default cluster, the query will be run on the named cluster.
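Because aggregation happens on the Splunk side, other transforming commands can follow `hdxsearch` in the same way as the `stats` example above. As a minimal sketch, assuming the same placeholder table and columns, the following counts rows per host and method and sorts the groups by that count, using the standard SPL `stats` and `sort` commands:

```spl
| hdxsearch table="my.table" fields="reqHost, reqMethod" limit=0
| stats count by reqHost, reqMethod
| sort - count
```

With `limit=0`, matching rows are returned to Splunk before they are counted, so a narrow time picker range helps keep the result set manageable.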
Troubleshooting
- “Invalid Argument” messages when making queries.
- This usually means that a table name or field name in your query doesn’t exist. Double-check your table and field names.
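One way to narrow down which name is wrong, sketched here using only the parameters documented above, is to query the table with a wildcard and a small `limit`. The table name `my.table` is a placeholder. If this query also fails, the table name is the likely culprit. If it succeeds, the returned columns show the field names available for the `fields` parameter.

```spl
| hdxsearch table="my.table" fields="*" limit=1
```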
Support
If the troubleshooting step(s) above don’t help you, contact Hydrolix support at [email protected].