Projects and Tables

Hydrolix stores data within tables. You can group tables together in logical namespaces called projects. To reference your data, use the full path project.table. For example: monitoring.http_logs.

Projects

Projects are equivalent to databases in a traditional RDBMS. You can create any number of projects, as long as each name is unique. Use short, lowercase names, with underscores between words to improve readability (for example, http_logs).

For example, an organization could contain three different projects:

Figure: Multiple projects in an organization: Systems Monitoring, Stock Trading, and IOT

These projects contain three independent sets of unrelated data in the same deployment.

You can manage projects through the API or the Hydrolix user interface.
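
As a minimal sketch of the naming guidance and the project.table path described above, the following Python helper checks that a name is lowercase with underscores and joins the two parts. The regular expression and both function names are illustrative assumptions. This page only recommends the convention and doesn't state that Hydrolix enforces it.

```python
import re

# Assumed convention check: a lowercase letter followed by lowercase letters,
# digits, or underscores. This mirrors the recommendation on this page and is
# not a rule taken from the Hydrolix API.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def follows_convention(name: str) -> bool:
    """Return True if the whole name is lowercase with optional underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None


def qualified_name(project: str, table: str) -> str:
    """Build the full project.table path used to reference data."""
    for part in (project, table):
        if not follows_convention(part):
            raise ValueError(f"name does not follow the convention: {part!r}")
    return f"{project}.{table}"


print(qualified_name("monitoring", "http_logs"))  # monitoring.http_logs
```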

Create a project with the API

You must authenticate to use the API.

1. Log in with your username and password.
2. Create a project, providing a name and description.

This example demonstrates a successful request and response made to the create project API:

Request:

{
  "name": "monitoring",
  "description": "Global monitoring of web services"
}

Response:

{
    "uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
    "name": "monitoring",
    "description": "Global monitoring of web services",
    ...
}

The response contains the uuid of the new project. In subsequent API requests, include this uuid as the project path parameter to interact with resources in the project.
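
The request and response above could be handled from Python roughly as follows. This is a sketch under stated assumptions, not the definitive client: the projects collection path is inferred from the example table URL later on this page, the bearer-token header is assumed, and the function names are hypothetical. The request is built but not sent, so no credentials are needed to try it.

```python
import json
import urllib.request


def build_create_project_request(base_url, org_id, token, name, description):
    """Build (but do not send) a POST request that creates a project.

    Assumptions: the projects collection lives at
    {base_url}/config/v1/orgs/{org_id}/projects/ and the API accepts a
    bearer token in the Authorization header.
    """
    url = f"{base_url.rstrip('/')}/config/v1/orgs/{org_id}/projects/"
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def project_uuid(response_body):
    """Extract the uuid field from a create-project response body."""
    return json.loads(response_body)["uuid"]


request = build_create_project_request(
    "https://my-domain.hydrolix.live",
    "0ffa6312-61ba-4620-8d57-96514a7f3859",
    "<token from login>",  # placeholder, obtained by authenticating first
    "monitoring",
    "Global monitoring of web services",
)
print(request.get_method(), request.full_url)
# Sending it would look like: urllib.request.urlopen(request)
```

Keep the value returned by `project_uuid`, because later requests for tables and transforms need it as a path parameter.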

Project settings

The settings object specifies project-level configuration. It describes options such as default configuration for queries and rate limiting.

Property	Type	Purpose	Default	Required
`default_query_options`	object	See Query Options Reference for descriptions of each option.	See Query Options Reference	No
`blob`	string	Don't use this API option.	`null`	No
`rate_limit`	object	Limits bytes per second ingest rate and max payload size. See Rate Limits.	no limit	No

Tables

A table and its associated transforms represent your data set. Hydrolix stores the table as a compressed, sorted, two-dimensional data structure in a number of .hdx files in cloud storage. Each table belongs to a project, and a project can contain many tables.

The create table API endpoint lets you name your data set and set defaults and controls for:

- Stream ingest: hot and cold data parameters
- Autoingest: patterns and queues to read for notifications
- Background merge: enable or disable merging to optimize your data storage
- Data retention: policies for retaining and removing old data

Before you can ingest data, define a transform (a write schema) for the table that describes the data types to use.

Tables are flexible by design

One table may have multiple ingest transforms. This increases the number of columns of a table.

- Multiple transforms on a single table must share the same datetime column
- The total number of columns is a union of all transforms
- Ideal for very closely associated data sets arriving from different ingest methods
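
The two rules above (a shared datetime column, and columns as a union of all transforms) can be illustrated with a short Python sketch. The dictionary shape used for a transform here is a simplification invented for this example, not the Hydrolix transform schema, and the transform and column names are hypothetical.

```python
def table_columns(transforms: dict) -> list:
    """Return the union of columns across all transforms of one table.

    Each transform is modeled as {"datetime": <column name>,
    "columns": [<column names>]}. This shape is illustrative only.
    """
    datetime_columns = {t["datetime"] for t in transforms.values()}
    if len(datetime_columns) > 1:
        raise ValueError(
            f"transforms disagree on the datetime column: {sorted(datetime_columns)}"
        )

    merged = []
    for transform in transforms.values():
        for column in transform["columns"]:
            if column not in merged:
                merged.append(column)  # keep first-seen order, skip duplicates
    return merged


transforms = {
    "from_stream": {"datetime": "timestamp", "columns": ["timestamp", "host", "status"]},
    "from_batch": {"datetime": "timestamp", "columns": ["timestamp", "host", "bytes_sent"]},
}
print(table_columns(transforms))
# ['timestamp', 'host', 'status', 'bytes_sent']
```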

Manage tables using the REST API or the Web UI.

Create a table with the API

You must authenticate to use the API.

This example shows a create table API request and its response:

Request:

{
  "name": "http_logs",
  "description": "web logs"
}

Response:

{
    "project": "6b0692f9-c040-47b1-988a-582e57dd3631",
    "name": "http_logs",
    "description": "web logs",
    "uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
    "created": "2022-05-31T03:48:55.172580Z",
    "modified": "2022-05-31T03:48:55.172599Z",
    "settings": {
        "stream": {
            "hot_data_max_age_minutes": 3,
            "hot_data_max_active_partitions": 3,
            "hot_data_max_rows_per_partition": 12288000,
            "hot_data_max_minutes_per_partition": 1,
            "hot_data_max_open_seconds": 60,
            "hot_data_max_idle_seconds": 30,
            "cold_data_max_age_days": 3650,
            "cold_data_max_active_partitions": 50,
            "cold_data_max_rows_per_partition": 12288000,
            "cold_data_max_minutes_per_partition": 60,
            "cold_data_max_open_seconds": 300,
            "cold_data_max_idle_seconds": 60
        },
        "age": {
            "max_age_days": 0
        },
        "reaper": {
            "max_age_days": 1
        },
        "merge": {
            "enabled": true,
            "partition_duration_minutes": 60,
            "input_aggregation": 20000000000,
            "max_candidates": 20,
            "max_rows": 10000000,
            "max_partitions_per_candidate": 100,
            "min_age_mins": 1,
            "max_age_mins": 10080
        },
        "autoingest": {
            "enabled": false,
            "source": "",
            "pattern": "",
            "max_rows_per_partition": 12288000,
            "max_minutes_per_partition": 60,
            "max_active_partitions": 50,
            "input_aggregation": 1073741824,
            "dry_run": false
        },
        "sort_keys": [],
        "shard_key": null,
        "max_future_days": 0
    },
    "url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}

The response contains the uuid of the created table. To reference resources contained within a table (like transforms), include both the project uuid and the table uuid as path parameters in their API endpoints.
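
As a hedged sketch of how the two uuids combine into a path, the helpers below rebuild the `url` value from the example response. The path layout is taken from that example URL. The trailing slash on the tables collection, the bearer-token header, and the function names are assumptions.

```python
import json
import urllib.request


def tables_url(base_url: str, org_id: str, project_id: str) -> str:
    """URL of the tables collection inside one project (trailing slash assumed)."""
    return f"{base_url.rstrip('/')}/config/v1/orgs/{org_id}/projects/{project_id}/tables/"


def table_url(base_url: str, org_id: str, project_id: str, table_id: str) -> str:
    """URL of a single table, matching the url field in the example response."""
    return f"{tables_url(base_url, org_id, project_id)}{table_id}"


def build_create_table_request(base_url, org_id, project_id, token, name, description):
    """Build (but do not send) a POST request that creates a table."""
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        tables_url(base_url, org_id, project_id),
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


BASE = "https://my-domain.hydrolix.live"
ORG = "0ffa6312-61ba-4620-8d57-96514a7f3859"
PROJECT = "6b0692f9-c040-47b1-988a-582e57dd3631"
TABLE = "94dba0fa-24f6-4962-9190-e47ead444ec4"

print(table_url(BASE, ORG, PROJECT, TABLE))
```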

Table settings

The settings object specifies table-level configuration.

Property	Type	Purpose	Default	Required
`default_query_options`	object	Set query options for this table. See Query Options Precedence and Query Options Reference.	See Query Options Reference	No
`rate_limit`	object	Limits bytes per second ingest rate and max payload size. See Rate Limits.	`null` (no limit)	No
`summary`	object	Set this option if you want to create a summary table.	`null`	No
`stream`	object	Set this option to configure stream ingest options for the table.		No
`age`	object	Use this setting to configure a data retention policy after which data will be deactivated.	See Data Lifecycle Management	No
`reaper`	object	Use this setting to configure a data retention policy after which data will be deleted.	See Data Lifecycle Management	No
`merge`	object	Enable/disable merge and configure the merge pools. See the Merge Pools documentation.	`"enabled": true`, all other nested options default to `null`	No
`autoingest`	array[object]	Enable and configure a continuous, batch ingest task for this table. See also Batch Ingest.	`"enabled": false`	No
`sort_keys`	array[string]	Change the sort order of data as it's ingested and stored. See also Table Settings Reference.	`null`, Hydrolix sorts columns according to cardinality	No
`shard_key`	string	Shard based on a specified key rather than the default, time-based. See also Table Settings Reference.	`null`, results in time-based sharding	No
`max_future_days`	integer	Retain rows whose timestamp is less than this number of days in the future. See also Table Settings Reference.	`0`	No
`max_request_bytes`	integer	Maximum allowed request size in bytes as measured by the content length of the request.	`0` (no maximum)	No
`storage_map`	object	Assigns a default storage bucket to a table. See also Table Settings Reference.	`turbine`	No
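
To show how these properties might be assembled into a create table request body, here is a minimal Python sketch. It checks top-level settings keys against the properties listed in the table above before building the payload. Whether the API accepts a partial settings object like this one is an assumption, and `build_table_payload` is a hypothetical helper, not part of any Hydrolix client.

```python
import json

# Top-level settings properties listed in the table above.
TABLE_SETTINGS_KEYS = {
    "default_query_options", "rate_limit", "summary", "stream", "age",
    "reaper", "merge", "autoingest", "sort_keys", "shard_key",
    "max_future_days", "max_request_bytes", "storage_map",
}


def build_table_payload(name, description, settings=None):
    """Return a request body, rejecting settings keys this page doesn't list."""
    settings = dict(settings or {})
    unknown = sorted(set(settings) - TABLE_SETTINGS_KEYS)
    if unknown:
        raise ValueError(f"unknown table settings: {unknown}")
    payload = {"name": name, "description": description}
    if settings:
        payload["settings"] = settings
    return payload


payload = build_table_payload(
    "http_logs",
    "web logs",
    settings={
        "merge": {"enabled": False},  # disable background merge
        "max_future_days": 0,         # value shown in the example response
    },
)
print(json.dumps(payload, indent=2))
```

Checking keys locally catches typos such as `sort_key` before the request is sent. It doesn't validate nested options, which are described in the references linked from the table.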