Projects & Tables

All data is stored within a table. Tables are grouped together in a logical namespace we call a project. The full path used to reference your data is project.table, e.g. monitoring.http_logs.

Projects

You can create any number of projects. The only restriction is that each name must be unique. We recommend picking short, precise, lowercase names, using underscores to break up longer names. Consider a project as equivalent to a database in a typical RDBMS.

For example, an organization could have three different projects, "Systems Monitoring", "Stock Trading", and "IoT", for three unrelated sets of data. They can all co-exist in the same deployment.


You can manage projects via REST API or the Web UI.

Creating a Project via API

You will need to be authenticated to use the API.

  • Login with your username/password.
  • Create the project, providing a name and an optional description.

An example create project API request/response exchange:

{
  "name": "monitoring",
  "description": "Global monitoring of web services"
}
{
    "uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
    "name": "monitoring",
    "description": "Global monitoring of web services",
    ...
}

The response contains the uuid of the created project. All resources contained within a project (such as tables and transforms) are referenced via the project uuid path parameter in their API endpoints, so you will need to store the project uuid.
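As a sketch, the login-then-create flow might look like the following Python. The exact endpoint paths and auth scheme are assumptions (the projects path is inferred from the table URL shown later in this section); check the API reference for your deployment.

```python
import json
import urllib.request


def create_project(base_url, org_uuid, token, name, description=""):
    """POST a new project and return the parsed response dict.

    The endpoint path and bearer-token header are assumptions inferred
    from the resource URL shown in the create-table response.
    """
    url = f"{base_url}/config/v1/orgs/{org_uuid}/projects/"
    body = json.dumps({"name": name, "description": description}).encode()
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def project_uuid(response):
    """Extract the uuid to store: every table/transform endpoint nests under it."""
    return response["uuid"]
```

A caller would typically persist the returned uuid immediately, since every subsequent table and transform request needs it as a path parameter.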

Tables

A table (and the associated Write Transforms) represents your data set. Hydrolix stores it as a compressed, sorted, two-dimensional data structure in a number of .hdx files in cloud storage (AWS/GCP). A table is referenced via its project, and a project can contain many tables.

The Table API endpoint allows you to define a name for your data, along with:

  • controls for stream ingest - hot/cold data parameters
  • controls for auto-ingest - patterns and queues to read for notifications
  • enable/disable background merge functionality to optimize your data storage
  • TTL and removal of old data

You have full control over how data is ingested into a table, backed by sane defaults (for example, for streaming ingest) if you choose not to modify them.
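For instance, a create-table request could supply just the settings you want to change. The field names below are taken from the full settings block in the example response later in this section; whether partial settings are merged with the defaults in exactly this way is an assumption to verify against the API reference.

```json
{
  "name": "http_logs",
  "description": "web logs",
  "settings": {
    "merge": { "enabled": false },
    "age": { "max_age_days": 90 }
  }
}
```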

Before you can ingest data, you will need to define a transform (a write schema) for the table, describing the data types to use.

👍 Advanced concept: Tables are flexible by design!

One table may have multiple ingest transforms, essentially expanding the column width of a table.

  • multiple transforms on a single table must share the same datetime column
  • the resulting table column width is a union of all transforms
  • ideal for very closely associated data-sets arriving from different ingest methods
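The column-union behaviour described above can be illustrated with a small sketch. The transform column lists and the helper function are hypothetical, purely to show how widths combine; this is not a real API.

```python
def merged_columns(transforms, datetime_col="timestamp"):
    """Union the columns of several transforms feeding one table.

    Each transform must share the same datetime column, mirroring the
    constraint on multiple transforms per table.
    """
    for cols in transforms:
        if datetime_col not in cols:
            raise ValueError(f"every transform must include {datetime_col!r}")
    merged = []
    for cols in transforms:
        for col in cols:
            if col not in merged:
                merged.append(col)
    return merged


# Two hypothetical transforms feeding the same table, e.g. one for a
# streaming source and one for a batch source:
stream_cols = ["timestamp", "hostname", "status"]
batch_cols = ["timestamp", "hostname", "region"]

# The resulting table width is the union of both column sets
print(merged_columns([stream_cols, batch_cols]))
# → ['timestamp', 'hostname', 'status', 'region']
```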

The sample project used in the tutorials includes a variety of tables, each with several columns. The table metrics has columns timestamp, hostname, region, and so on.


You can manage tables via REST API or the Web UI.

Create a Table via API

You will need to be authenticated to use the API.

  • Login with your username/password.
  • Create the table, providing a name and optional ingest settings.

An example create table API request/response exchange:

{
  "name": "http_logs",
  "description" : "web logs"
}
{
    "project": "6b0692f9-c040-47b1-988a-582e57dd3631",
    "name": "http_logs",
    "description": "web logs",
    "uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
    "created": "2022-05-31T03:48:55.172580Z",
    "modified": "2022-05-31T03:48:55.172599Z",
    "settings": {
        "stream": {
            "hot_data_max_age_minutes": 3,
            "hot_data_max_active_partitions": 3,
            "hot_data_max_rows_per_partition": 12288000,
            "hot_data_max_minutes_per_partition": 1,
            "hot_data_max_open_seconds": 60,
            "hot_data_max_idle_seconds": 30,
            "cold_data_max_age_days": 365,
            "cold_data_max_active_partitions": 50,
            "cold_data_max_rows_per_partition": 12288000,
            "cold_data_max_minutes_per_partition": 60,
            "cold_data_max_open_seconds": 300,
            "cold_data_max_idle_seconds": 60
        },
        "age": {
            "max_age_days": 0
        },
        "reaper": {
            "max_age_days": 1
        },
        "merge": {
            "enabled": true,
            "partition_duration_minutes": 60,
            "input_aggregation": 20000000000,
            "max_candidates": 20,
            "max_rows": 10000000,
            "max_partitions_per_candidate": 100,
            "min_age_mins": 1,
            "max_age_mins": 10080
        },
        "autoingest": {
            "enabled": false,
            "source": "",
            "pattern": "",
            "max_rows_per_partition": 12288000,
            "max_minutes_per_partition": 60,
            "max_active_partitions": 50,
            "input_aggregation": 1073741824,
            "dry_run": false
        },
        "sort_keys": [],
        "shard_key": null,
        "max_future_days": 0
    },
    "url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}

The response contains the uuid of the created table. All resources contained within a table (such as transforms) are referenced via the project uuid and table uuid path parameters in their API endpoints, so you will need to store the table uuid.
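The nesting of path parameters can be seen in the url field of the response above. As a sketch, a helper that reconstructs that resource path from the stored uuids might look like this (the path layout is taken directly from the example response; the function itself is hypothetical):

```python
def table_url(base_url, org_uuid, project_uuid, table_uuid):
    """Build the API path for a table resource.

    Mirrors the "url" field returned by the create-table call:
    /config/v1/orgs/{org}/projects/{project}/tables/{table}
    """
    return (
        f"{base_url}/config/v1/orgs/{org_uuid}"
        f"/projects/{project_uuid}/tables/{table_uuid}"
    )


# Using the uuids from the example response above:
print(table_url(
    "https://my-domain.hydrolix.live",
    "0ffa6312-61ba-4620-8d57-96514a7f3859",
    "6b0692f9-c040-47b1-988a-582e57dd3631",
    "94dba0fa-24f6-4962-9190-e47ead444ec4",
))
```

This is why both uuids need to be stored: transforms and other table-scoped resources hang off this same path.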


What’s Next