Projects & Tables

All data is stored within a table. Tables are grouped together in a logical namespace we call a project. The full path used to reference your data is project.table, e.g. monitoring.http_logs.

Projects

You can create any number of projects. The only restriction is that each name must be unique. We recommend short, precise, lowercase names, using underscores to break up longer names. Consider a project the equivalent of a database in a typical RDBMS.

For example, an organization could have three different projects, "Systems Monitoring", "Stock Trading", and "IoT", for three unrelated sets of data. They can all coexist in the same deployment.


You can manage projects via REST API or the Web UI.

Creating a Project via API

You will need to be authenticated to use the API.

  • Login with your username/password.
  • Create a project, providing a name and an optional description.

An example create project API request/response exchange:

{
  "name": "monitoring",
  "description": "Global monitoring of web services"
}
{
    "uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
    "name": "monitoring",
    "description": "Global monitoring of web services",
    ...
}

The response contains the uuid of the created project. All resources contained within a project (such as tables and transforms) are referenced via the project uuid path parameter in their API endpoints, so you will need to store the project uuid.
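As a sketch, the exchange above can be scripted. The host and org uuid below are placeholders, and the endpoint path is an assumption inferred from the table URL shown further down this page, not guaranteed for your deployment:

```python
# Sketch: build a create-project request body and pull the uuid out of the
# response. HOST and ORG are placeholders; the endpoint path is inferred
# from the table URL shown later on this page.
import json

HOST = "https://my-domain.hydrolix.live"
ORG = "0ffa6312-61ba-4620-8d57-96514a7f3859"

def projects_endpoint(host: str, org: str) -> str:
    """Endpoint that creates projects for an org (assumed path)."""
    return f"{host}/config/v1/orgs/{org}/projects/"

def create_project_body(name: str, description: str = "") -> str:
    """JSON body for the create-project POST."""
    return json.dumps({"name": name, "description": description})

def extract_uuid(response: dict) -> str:
    """Store this uuid: tables and transforms are addressed by it."""
    return response["uuid"]

# The sample response from above, parsed as a dict:
sample_response = {
    "uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
    "name": "monitoring",
    "description": "Global monitoring of web services",
}
project_uuid = extract_uuid(sample_response)
```

You would POST the body to the endpoint with your preferred HTTP client, then keep `project_uuid` for all later project-scoped calls.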

Tables

A table (and its associated Write Transforms) represents your dataset. Hydrolix stores it as a compressed, sorted, two-dimensional data structure in a number of .hdx files in cloud storage (AWS/GCP). A table is referenced via its project, and a project can contain many tables.

The Table API endpoint allows you to define a name for your data, along with:

  • controls for stream ingest (hot/cold data parameters)
  • controls for auto-ingest (patterns and queues to read for notifications)
  • enabling/disabling the background merge functionality that optimizes your data storage
  • a TTL for the removal of old data

You have full control over how data is ingested into a table, backed by sane defaults (for example, for streaming ingest) if you choose not to change anything.
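As a sketch, a create-table body that overrides one of the streaming defaults could look like this. The settings keys mirror the example response shown further down this page; the value is illustrative:

```python
# Sketch: a create-table payload overriding one streaming-ingest default.
# Keys mirror the settings block in the example table response on this
# page; any key you omit keeps its server-side default value.
import json

table_request = {
    "name": "http_logs",
    "description": "web logs",
    "settings": {
        "stream": {
            "hot_data_max_age_minutes": 10,  # illustrative override
        },
    },
}
body = json.dumps(table_request)
```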

Before you can ingest data, you will need to define a transform (a write schema) for the table, describing the data types to use.
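As a rough, hypothetical sketch of what a minimal transform body might contain, every transform names its columns and marks a primary datetime column. The field names below are assumptions for illustration; the transform documentation has the authoritative schema:

```python
# Hypothetical minimal transform body -- field names here are illustrative
# assumptions, not the authoritative schema; see the transform docs.
import json

transform = {
    "name": "http_logs_transform",
    "type": "json",  # assumed ingest format
    "settings": {
        "output_columns": [
            # a primary datetime column, required for every table
            {"name": "timestamp", "datatype": {"type": "datetime", "primary": True}},
            {"name": "hostname", "datatype": {"type": "string"}},
        ],
    },
}
body = json.dumps(transform)
```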

👍

Advanced concept: Tables are flexible by design!

One table may have multiple ingest transforms, essentially expanding the column width of a table.

  • multiple transforms on a single table must share the same datetime column
  • the resulting table column width is a union of all transforms
  • ideal for very closely associated data-sets arriving from different ingest methods

The sample project used in the tutorials includes a variety of tables. Each table has several columns. The table metrics has columns timestamp, hostname, region, etc.


You can manage tables via REST API or the Web UI.

Create a Table via API

You will need to be authenticated to use the API.

  • Login with your username/password.
  • Create a table, providing a name and optional ingest settings.

An example create table API request/response exchange:

{
  "name": "http_logs",
  "description" : "web logs"
}
{
    "project": "6b0692f9-c040-47b1-988a-582e57dd3631",
    "name": "http_logs",
    "description": "web logs",
    "uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
    "created": "2022-05-31T03:48:55.172580Z",
    "modified": "2022-05-31T03:48:55.172599Z",
    "settings": {
        "stream": {
            "hot_data_max_age_minutes": 3,
            "hot_data_max_active_partitions": 3,
            "hot_data_max_rows_per_partition": 12288000,
            "hot_data_max_minutes_per_partition": 1,
            "hot_data_max_open_seconds": 60,
            "hot_data_max_idle_seconds": 30,
            "cold_data_max_age_days": 365,
            "cold_data_max_active_partitions": 50,
            "cold_data_max_rows_per_partition": 12288000,
            "cold_data_max_minutes_per_partition": 60,
            "cold_data_max_open_seconds": 300,
            "cold_data_max_idle_seconds": 60
        },
        "age": {
            "max_age_days": 0
        },
        "reaper": {
            "max_age_days": 1
        },
        "merge": {
            "enabled": true,
            "partition_duration_minutes": 60,
            "input_aggregation": 20000000000,
            "max_candidates": 20,
            "max_rows": 10000000,
            "max_partitions_per_candidate": 100,
            "min_age_mins": 1,
            "max_age_mins": 10080
        },
        "autoingest": {
            "enabled": false,
            "source": "",
            "pattern": "",
            "max_rows_per_partition": 12288000,
            "max_minutes_per_partition": 60,
            "max_active_partitions": 50,
            "input_aggregation": 1073741824,
            "dry_run": false
        },
        "sort_keys": [],
        "shard_key": null,
        "max_future_days": 0
    },
    "url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}

The response contains the uuid of the created table. All resources contained within a table (such as transforms) are referenced via the project uuid and table uuid path parameters in their API endpoints, so you will need to store the table uuid.
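Those path parameters can be seen in the url field of the response above, which can be rebuilt as a sketch (host and org uuid are the placeholders from the example):

```python
# Sketch: resources inside a table are addressed by both the project uuid
# and the table uuid, as the "url" field of the response above shows.
# HOST and ORG are the placeholder values from the example response.
HOST = "https://my-domain.hydrolix.live"
ORG = "0ffa6312-61ba-4620-8d57-96514a7f3859"
PROJECT_UUID = "6b0692f9-c040-47b1-988a-582e57dd3631"
TABLE_UUID = "94dba0fa-24f6-4962-9190-e47ead444ec4"

def table_url(host: str, org: str, project_uuid: str, table_uuid: str) -> str:
    """Rebuild a table's API URL from its uuid path parameters."""
    return f"{host}/config/v1/orgs/{org}/projects/{project_uuid}/tables/{table_uuid}"

url = table_url(HOST, ORG, PROJECT_UUID, TABLE_UUID)
```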

