Projects & Tables

Hydrolix stores data within tables. You can group tables together in logical namespaces called projects. To reference your data, use the full path project.table, for example monitoring.http_logs.

Projects

Projects are equivalent to databases in a traditional RDBMS. You can create any number of projects, as long as each name is unique. We recommend picking short lowercase names. When you must use a longer name, break it up logically with underscores to improve readability and reduce the likelihood of typos.

For example, an organization could contain three different projects:

Multiple projects in an organization

"Systems Monitoring"
"Stock Trading"
"IOT"

These projects could contain 3 independent sets of unrelated data. They can all coexist in the same deployment.
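Applying the naming recommendation to the three example projects might look like the sketch below. The helper `suggest_project_name` is hypothetical and not part of Hydrolix; it only illustrates one way to derive short lowercase names with underscores from human-readable labels.

```python
import re


def suggest_project_name(label: str) -> str:
    """Turn a human-readable label into a lowercase name with underscores."""
    # Lowercase the label, then collapse each run of characters that are not
    # letters or digits into a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", label.lower())
    # Drop any underscore left at the start or end.
    return name.strip("_")


for label in ("Systems Monitoring", "Stock Trading", "IOT"):
    print(suggest_project_name(label))
```

With these names, a table in the first project would be referenced as, for example, systems_monitoring.http_logs.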

You can manage projects via the API or the Portal UI.

Creating a Project via API

You must authenticate to use the API.

Log in with your username and password.
Create a project, providing a name and description.

The following snippets show an example request to the create project API and the response Hydrolix returns after it successfully creates the project:

{
  "name": "monitoring",
  "description": "Global monitoring of web services"
}

{
    "uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
    "name": "monitoring",
    "description": "Global monitoring of web services",
    ...
}

The response contains the uuid of the created project. To reference resources contained within a project, like tables and transforms, include the project uuid as a path parameter in requests to those API endpoints.
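As a rough sketch, the request and response above could be handled from Python as follows. The endpoint path is an assumption inferred from the `url` field shown in the create table response in this document, and the bearer `Authorization` header is likewise assumed; the host, organization uuid, and token values are placeholders. The code only builds the request and does not send it.

```python
import json
import urllib.request


def build_create_project_request(host, org_uuid, token, name, description):
    """Build (but do not send) a POST request that creates a project."""
    # Assumed path, modeled on the table URL returned by the API.
    url = f"https://{host}/config/v1/orgs/{org_uuid}/projects/"
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the login token is passed as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


def tables_url(host, org_uuid, project_uuid):
    """Return the tables collection URL, with the project uuid as a path parameter."""
    return f"https://{host}/config/v1/orgs/{org_uuid}/projects/{project_uuid}/tables/"


HOST = "my-domain.hydrolix.live"                   # placeholder host
ORG = "0ffa6312-61ba-4620-8d57-96514a7f3859"       # placeholder organization uuid
request = build_create_project_request(
    HOST, ORG, "example-token", "monitoring", "Global monitoring of web services"
)
print(request.get_method(), request.full_url)

# The project uuid comes from the "uuid" field of the create project response.
project_uuid = "dfadb1a9-c2ec-4e3e-aab6-1117c5532843"
print(tables_url(HOST, ORG, project_uuid))
```

To actually send the request, pass it to `urllib.request.urlopen` and parse the response body with `json.loads`.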

Project settings

The settings object specifies project-level configuration, such as default options for queries against data stored in the project and rate limiting.

Property	Type	Purpose	Default	Required
`default_query_options`	object	See Query Options Reference for descriptions of each option.	See Query Options Reference	No
`blob`	object	Don't use this API option.	`null`	No
`rate_limit`	object	Limits bytes per second ingest rate and max payload size. See Rate Limits.	no limit	No

Tables

A table, together with its associated write transforms, represents your dataset. Hydrolix stores it as a compressed, sorted, two-dimensional data structure in a number of .hdx files in cloud storage (AWS/GCP). You reference a table through its project, and a project can contain many tables.

The Table API endpoint allows you to define a name for your data, along with:

controls for stream ingest - hot/cold data parameters
controls for autoingest - patterns and queues to read for notifications
enable/disable background merge functionality to optimize your data storage
TTL and removal of old data

You have full control over how data is ingested into a table, and sensible defaults apply to any settings you choose not to modify, such as those for streaming ingest.

Before you can ingest data you will need to define a transform (a write schema) for a table, describing the data types to use.

👍
Advanced concept: Tables are flexible by design!
One table may have multiple ingest transforms, essentially expanding the column width of a table.

Multiple transforms on a single table must share the same datetime column.

The resulting table column width is the union of all transforms.

This is ideal for closely associated datasets arriving through different ingest methods.
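The rules in this callout can be illustrated with a small sketch. The dictionaries below are a hypothetical simplification and not the Hydrolix transform schema; they only model each transform's datetime column and the column names it writes.

```python
def table_columns(transforms):
    """Return the table's columns as the union of every transform's columns."""
    # All transforms on one table must share the same datetime column.
    datetime_columns = {t["datetime_column"] for t in transforms}
    if len(datetime_columns) > 1:
        raise ValueError(
            f"transforms disagree on the datetime column: {sorted(datetime_columns)}"
        )
    # Build the union while keeping the order in which columns first appear.
    columns = []
    for transform in transforms:
        for column in transform["columns"]:
            if column not in columns:
                columns.append(column)
    return columns


# Two hypothetical transforms feeding the same table from different ingest methods.
web_transform = {
    "datetime_column": "timestamp",
    "columns": ["timestamp", "hostname", "status"],
}
agent_transform = {
    "datetime_column": "timestamp",
    "columns": ["timestamp", "hostname", "region"],
}

print(table_columns([web_transform, agent_transform]))
# ['timestamp', 'hostname', 'status', 'region']
```

Rows written through one transform simply have no values for the columns that only the other transform defines.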

The sample project used in tutorials includes a variety of tables. Each table has several columns. For example, the table metrics has the columns timestamp, hostname, region, and so on.

Sample Table

You can manage tables via REST API or the Web UI.

Create a Table via API

You will need to be authenticated to use the API.

Get the bearer token, which is good for the next 24 hours, to authenticate future API calls. This command assumes you've set the $HDX_HOSTNAME, $HDX_USER and $HDX_PASSWORD environment variables:

# Log in and store the access token from the JSON response in $HDX_TOKEN.
export HDX_TOKEN=$(
  curl -v -X POST -H "Content-Type: application/json" \
  "https://$HDX_HOSTNAME/config/v1/login/" \
  -d "{
    \"username\":\"$HDX_USER\",
    \"password\":\"$HDX_PASSWORD\"
  }" | jq -r ".auth_token.access_token"
)
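For readers scripting against the API from Python rather than the shell, the same login step might be sketched as below. It mirrors the curl command above: the endpoint and the `auth_token.access_token` path come from that command, while the function names and the fallback host are assumptions made for illustration. The request is built but not sent.

```python
import json
import os
import urllib.request


def build_login_request(host, username, password):
    """Build (but do not send) the login request used to obtain a bearer token."""
    url = f"https://{host}/config/v1/login/"
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def extract_access_token(response_body):
    """Pull the token out of the login response, like jq's .auth_token.access_token."""
    return json.loads(response_body)["auth_token"]["access_token"]


# Same environment variables as the shell example; the defaults are placeholders.
login_request = build_login_request(
    os.environ.get("HDX_HOSTNAME", "my-domain.hydrolix.live"),
    os.environ.get("HDX_USER", "user@example.com"),
    os.environ.get("HDX_PASSWORD", ""),
)
print(login_request.get_method(), login_request.full_url)
```

Sending `login_request` with `urllib.request.urlopen` and passing the response body to `extract_access_token` would yield the token to reuse in later API calls.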

Create a table, providing a name and, optionally, ingest settings.

The following snippets show an example create table API request and its response:

{
  "name": "http_logs",
  "description": "web logs"
}

{
    "project": "6b0692f9-c040-47b1-988a-582e57dd3631",
    "name": "http_logs",
    "description": "web logs",
    "uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
    "created": "2022-05-31T03:48:55.172580Z",
    "modified": "2022-05-31T03:48:55.172599Z",
    "settings": {
        "stream": {
            "hot_data_max_age_minutes": 3,
            "hot_data_max_active_partitions": 3,
            "hot_data_max_rows_per_partition": 12288000,
            "hot_data_max_minutes_per_partition": 1,
            "hot_data_max_open_seconds": 60,
            "hot_data_max_idle_seconds": 30,
            "cold_data_max_age_days": 3650,
            "cold_data_max_active_partitions": 50,
            "cold_data_max_rows_per_partition": 12288000,
            "cold_data_max_minutes_per_partition": 60,
            "cold_data_max_open_seconds": 300,
            "cold_data_max_idle_seconds": 60
        },
        "age": {
            "max_age_days": 0
        },
        "reaper": {
            "max_age_days": 1
        },
        "merge": {
            "enabled": true,
            "partition_duration_minutes": 60,
            "input_aggregation": 20000000000,
            "max_candidates": 20,
            "max_rows": 10000000,
            "max_partitions_per_candidate": 100,
            "min_age_mins": 1,
            "max_age_mins": 10080
        },
        "autoingest": {
            "enabled": false,
            "source": "",
            "pattern": "",
            "max_rows_per_partition": 12288000,
            "max_minutes_per_partition": 60,
            "max_active_partitions": 50,
            "input_aggregation": 1073741824,
            "dry_run": false
        },
        "sort_keys": [],
        "shard_key": null,
        "max_future_days": 0
    },
    "url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}

The response contains the uuid of the created table. API endpoints for resources contained within a table, like transforms, take both the project uuid and the table uuid as path parameters, so store the table uuid for later requests.
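A minimal sketch of keeping the identifiers from the response is shown below, using an abbreviated copy of the sample response above. The helper `child_collection_url` and the "transforms" path segment are assumptions for illustration; check the API reference for the exact endpoint paths.

```python
import json

# Abbreviated copy of the sample create table response shown above.
sample_response = json.loads("""
{
    "project": "6b0692f9-c040-47b1-988a-582e57dd3631",
    "name": "http_logs",
    "uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
    "url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}
""")


def child_collection_url(table_response, collection):
    """Append a child collection name to the table URL returned by the API."""
    # Assumption: child resources live directly under the table's own URL.
    return table_response["url"].rstrip("/") + f"/{collection}/"


project_uuid = sample_response["project"]
table_uuid = sample_response["uuid"]

print(project_uuid)
print(table_uuid)
print(child_collection_url(sample_response, "transforms"))
```

Because the returned `url` already embeds both the project uuid and the table uuid, storing it alongside the table uuid avoids rebuilding the path by hand.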

Table settings

The settings object specifies table-level configuration that applies at data storage and query time, such as default query options, rate limits, shard keys, and other behaviors.

Property	Type	Purpose	Default	Required
`default_query_options`	object	Set query options for this table. See Query Options Precedence and Query Options Reference.	See Query Options Reference	No
`rate_limit`	object	Limits bytes per second ingest rate and max payload size. See Rate Limits.	no limit	No
`summary`	object	Set this option if you want to create a summary table.	`null`	No
`stream`	object	Set this option to configure stream ingest options for the table.	`none`	No
`age`	object	Use this setting to configure a TTL after which data will be deactivated.	See Data Lifecycle Management	No
`reaper`	object	Use this setting to configure a TTL after which data will be deleted.	See Data Lifecycle Management	No
`merge`	object	Enable/disable merge and configure the merge pools. See the Merge Pools documentation.	`"enabled": true`, all other nested options default to `null`	No
`autoingest`	array[object]	Enable and configure a continuous, batch ingest task for this table. See also Batch Ingest.	`"enabled": false`	No
`sort_keys`	array[string]	Change the sort order of data as it's ingested and stored. See also Table Settings Reference.	`null`, Hydrolix sorts columns according to cardinality.	No
`shard_key`	string	Shard based on a specified key rather than the default, time-based. See also Table Settings Reference.	`null`, results in time-based sharding	No
`max_future_days`	integer	Retain rows whose timestamp is less than this number of days in the future. See also Table Settings Reference.	`0`	No
`max_request_bytes`	integer	Maximum allowed request size in bytes as measured by the content length of the request.	`0`, specifies no configured maximum	No
`storage_map`	object	Assigns a default storage bucket to a table. See also Table Settings Reference.	`turbine`	No
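As an illustration, a create table payload that overrides a few of these settings could be assembled as in the sketch below. The top-level property names come from the table above and the nested keys from the sample response; the helper `build_table_payload`, the chosen values, and the assumption that the request nests these options under "settings" the same way the response does are all illustrative.

```python
import json

# Top-level settings properties listed in the table above.
KNOWN_TABLE_SETTINGS = {
    "default_query_options", "rate_limit", "summary", "stream", "age",
    "reaper", "merge", "autoingest", "sort_keys", "shard_key",
    "max_future_days", "max_request_bytes", "storage_map",
}


def build_table_payload(name, description, settings=None):
    """Build a create table payload, rejecting unknown top-level settings."""
    settings = settings or {}
    unknown = sorted(set(settings) - KNOWN_TABLE_SETTINGS)
    if unknown:
        raise ValueError(f"unknown table settings: {unknown}")
    payload = {"name": name, "description": description}
    # Every setting is optional, so omit the object when nothing is overridden.
    if settings:
        payload["settings"] = settings
    return payload


payload = build_table_payload(
    "http_logs",
    "web logs",
    {
        "merge": {"enabled": False},      # turn off background merge
        "reaper": {"max_age_days": 30},   # hypothetical TTL before deletion
    },
)
print(json.dumps(payload, indent=2))
```

Checking property names locally catches typos such as "sort_key" before the request reaches the API.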
