SQL Statements
Hydrolix SQL Statements
In Hydrolix, data is organized by project and table. In the bucket used when the system was setup, there is a path structure that looks
bucket_name/db/hdx
Under hdx
are project folders. In any initial deployment, the project default
should exist. Under default
there should be one table called default_table
. If you are ever unsure about the names of projects or tables, you can browse the
bucket structure to find the names.
Table Statements
Two table commands are available in Hydrolix,
exists tablename
describe tablename
Required Statement Clauses
SELECT is currently the only type of statement supported by Hydrolix
The mandatory parts of statements in Hydrolix SQL are:
SELECT
FROM
WHERE
Each are described below.
SELECT
SELECT
statements work as expected. They allow you to choose which data you want from a data set. It can include column names and function calls.
FROM
FROM
specifies where the data should come from. In Hydrolix, that requires a :
- Project Name
- Table Name
- View Schema Name (Optional)
The FROM
clause ends up looking like:
SELECT
FROM project_name.table_name#my_view
WHERE
If a default view schema is defined on a table, it does not have to be specified as part of the query:
SELECT
FROM project_name.table_name
WHERE
WHERE
WHERE
clauses in Hydrolix are required, and must contain the primary index defined in the ingest transform schema. It can contain other conditions as well, but MUST contain a test on the primary index.
SELECT count(timestamp) AS count
FROM my_first_project.the_table#the_view
WHERE (timestamp BETWEEN ‘1977-04-25 00:00:00’ AND ‘2010-04-25 23:00:00’)
Optional Statement Clauses
WITH
With statements allow you to create a named sub-query for use later in the select statement. They come before the SELECT
statement so the result can be used in the rest of the query.
WITH [subquery] AS unique_name
...
HAVING
HAVING
is a statement where conditions areapplied after an aggregation is complete. TheWHERE
statement conditions are applied before an aggregation. HAVING
statements are applie after.
GROUP BY
The GROUP BY
statement groups data together inrows. For example, you could query all incidents for a time period, grouping on incident type to get totals for each type of incident in a given
time period.
ORDER BY
ORDER BY
is used to sort the result set. Takes values:
asc
desc
If neither is specified, asc
is assumed.
IN
Used to test if a value is in a set.
SELECT avg(score)
FROM my_project.my_table WHERE "the_user_name" IN
['user1', 'user2', 'user3'] AND timestamp
BETWEEN....
would calculate the average of the score column for those 3 users.
LIMIT
LIMIT
the number of rows returned. It must be the last statement in SQL statement.
SELECT
FROM
WHERE
...
LIMIT 10
Updated 10 months ago