SQL Statements

Hydrolix implements a significant part of the ClickHouse SQL query language, which is ANSI SQL compliant. Hydrolix uses the ClickHouse SQL parser natively before creating an execution plan, so most operations that work on ClickHouse should also work on Hydrolix, except where noted.

This documentation is meant to augment or clarify the original ClickHouse Documentation and note any differences in the Hydrolix implementation.

While most functions are called out, not all are. It's safe to assume that functions that aren't called out explicitly or by category are implemented and work as expected.

In Hydrolix, data is organized by project and table. In the bucket that was configured when the system was set up, data is stored in a path structure under the following prefix. Within the cloud bucket, projects and tables are referenced by UUID.

bucket_name/db/hdx

Table Statements

Two table commands are available in Hydrolix:

exists tablename
describe tablename
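
As a rough illustration, the two commands might be issued as shown below. The name my_project.my_table is a placeholder, and qualifying the table with its project is an assumption based on how FROM works in Hydrolix. In ClickHouse, EXISTS returns a single 0 or 1 value and DESCRIBE returns one row per column with its name and type; Hydrolix output may differ.

```sql
-- Run each statement separately.
-- my_project.my_table is a hypothetical project and table name.

-- Check whether the table exists.
EXISTS my_project.my_table

-- List the table's columns and their types.
DESCRIBE my_project.my_table
```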

Required Statement Clauses

The SELECT statement is currently the primary SQL statement supported by Hydrolix.

See Insert / Save Data Manually for examples of INSERT INTO.

The mandatory parts of SELECT statements in Hydrolix SQL are:

SELECT
FROM
WHERE

Each is described below.

SELECT

The SELECT clause works as expected. It lets you choose which data you want from a data set, and it can include column names and function calls.
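
A minimal sketch of a SELECT list that mixes plain columns with a function call follows. The table my_project.my_table, the columns the_user_name and score, and the date range are illustrative assumptions; toStartOfHour is a standard ClickHouse date function.

```sql
SELECT
    the_user_name,                       -- plain column
    toStartOfHour(timestamp) AS hour,    -- function call with an alias
    score                                -- plain column
FROM my_project.my_table
WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-02 00:00:00'
```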

FROM

FROM specifies where the data should come from. In Hydrolix, it is made up of:

Project Name
Table Name
View Schema Name (Optional)

The FROM clause ends up looking like:

SELECT
FROM project_name.table_name#my_view
WHERE

If a default view schema is defined on a table, it doesn't have to be specified as part of the query:

SELECT
FROM project_name.table_name
WHERE

WHERE

A WHERE clause is required in Hydrolix, and it MUST contain a test on the primary index defined in the ingest transform schema. It can contain other conditions as well.

SELECT count(timestamp) AS count
  FROM my_first_project.the_table#the_view
  WHERE (timestamp BETWEEN '1977-04-25 00:00:00' AND '2010-04-25 23:00:00')
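
Additional conditions can be combined with the primary index test. The sketch below assumes the table also has a numeric column named score, which is a hypothetical column used only for illustration.

```sql
SELECT count(timestamp) AS count
FROM my_first_project.the_table#the_view
WHERE (timestamp BETWEEN '1977-04-25 00:00:00' AND '2010-04-25 23:00:00')  -- required primary index test
  AND score > 50                                                          -- optional extra condition (hypothetical column)
```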

Optional Statement Clauses

WITH

A WITH clause lets you create a named subquery for use later in the query. It comes before the SELECT clause so that its result can be used in the rest of the query.

WITH [subquery] AS unique_name
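
As a sketch of this form, the query below uses the ClickHouse scalar subquery syntax to compute a maximum score once and reuse it in the main query. The table, the columns, and the date range are illustrative assumptions, and Hydrolix support for this exact shape has not been verified here.

```sql
WITH (
    -- Scalar subquery: returns a single value that is bound to the name top_score.
    SELECT max(score)
    FROM my_project.my_table
    WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59'
) AS top_score
SELECT the_user_name, score
FROM my_project.my_table
WHERE (timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59')
  AND score = top_score
```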

HAVING

HAVING applies conditions after an aggregation is complete, whereas WHERE conditions are applied before the aggregation.
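
The difference can be sketched with a query that uses both clauses. The table, the columns, and the threshold are illustrative assumptions.

```sql
SELECT the_user_name, avg(score) AS avg_score
FROM my_project.my_table
-- WHERE filters individual rows before they are aggregated.
WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59'
GROUP BY the_user_name
-- HAVING filters the aggregated groups, keeping users whose average exceeds 50.
HAVING avg(score) > 50
```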

GROUP BY

The GROUP BY clause combines rows that share the same values into groups, so that aggregate functions can be applied to each group. For example, you could query all incidents for a time period, grouping on incident type to get totals for each type of incident in that time period.
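
The incident example might look like the sketch below. The table my_project.incidents and the column incident_type are hypothetical names chosen for illustration.

```sql
SELECT
    incident_type,
    count() AS total          -- number of incidents of each type
FROM my_project.incidents
WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59'
GROUP BY incident_type
```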

ORDER BY

ORDER BY sorts the result set. Each sort expression takes one of the following directions:

asc
desc

If neither is specified, asc is assumed.
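
A short sketch with one explicit and one implicit direction follows. The table and column names are illustrative assumptions.

```sql
SELECT the_user_name, score
FROM my_project.my_table
WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59'
ORDER BY
    score DESC,        -- highest scores first
    the_user_name      -- no direction given, so ascending is used for ties
```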

IN

IN tests whether a value is in a set.

SELECT avg(score)
FROM my_project.my_table
WHERE "the_user_name" IN ['user1', 'user2', 'user3']
  AND timestamp BETWEEN ...

This query calculates the average of the score column for those three users.

LIMIT

LIMIT restricts the number of rows returned. It must be the last clause in the SQL statement.

SELECT
FROM
WHERE
...
LIMIT 10
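
A filled-in version of this skeleton could look like the following, which returns the ten highest scores in a time range. The table, the columns, and the date range are illustrative assumptions.

```sql
SELECT the_user_name, score
FROM my_project.my_table
WHERE timestamp BETWEEN '2021-01-01 00:00:00' AND '2021-01-31 23:59:59'
ORDER BY score DESC
LIMIT 10               -- keep only the first 10 rows of the sorted result
```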