Projects & Tables
Hydrolix stores data within tables. You can group tables together in logical namespaces called projects. To reference your data, use the full path project.table. For example: monitoring.http_logs.
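As a minimal sketch of the naming convention above, the following Python helpers join and split a project.table path. The function names are hypothetical and not part of any Hydrolix client, and rejecting dots inside a name is an assumption of this sketch rather than a documented rule.

```python
def qualified_name(project: str, table: str) -> str:
    """Join a project and a table name into the full path project.table."""
    for part in (project, table):
        # Assumption: a name is non-empty and contains no dot of its own.
        if not part or "." in part:
            raise ValueError(f"invalid name: {part!r}")
    return f"{project}.{table}"


def split_qualified_name(path: str) -> tuple[str, str]:
    """Split a full path back into (project, table)."""
    project, sep, table = path.partition(".")
    if not sep or not project or not table or "." in table:
        raise ValueError(f"expected project.table, got {path!r}")
    return project, table


print(qualified_name("monitoring", "http_logs"))
print(split_qualified_name("monitoring.http_logs"))
```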
Projects
Projects are equivalent to databases in a traditional RDBMS. You can create any number of projects, as long as each name is unique. Use short, lower case names with underscores to improve readability. `httpl
For example, an organization could contain three different projects:
These projects contain three independent sets of unrelated data in the same deployment.
You can manage projects via the API or the Hydrolix user interface.
Create a project with the API
You must authenticate to use the API.
- Log in with your username/password.
- Create a project, providing a name and description.
This example demonstrates a successful request and response made to the create project API:
{
"name": "monitoring",
"description": "Global monitoring of web services"
}
{
"uuid": "dfadb1a9-c2ec-4e3e-aab6-1117c5532843",
"name": "monitoring",
"description": "Global monitoring of web services",
...
}
The response contains the uuid of the new project. In subsequent requests to the API, include the project uuid path parameter in your request to interact with resources in the project.
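The request above could be prepared in Python roughly as follows. This is a sketch under stated assumptions: the endpoint path is inferred from the url field shown in the create table response on this page, the trailing slash and the bearer-token Authorization header are assumptions, and the token value is a placeholder for whatever the login step returns. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_create_project_request(base_url, org_uuid, token, name, description):
    """Prepare (but do not send) a POST request that creates a project."""
    # Assumption: projects live under /config/v1/orgs/{org_uuid}/projects/.
    url = f"{base_url.rstrip('/')}/config/v1/orgs/{org_uuid}/projects/"
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the API accepts a bearer token obtained at login.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_create_project_request(
    "https://my-domain.hydrolix.live",
    "0ffa6312-61ba-4620-8d57-96514a7f3859",
    "EXAMPLE_TOKEN",  # placeholder, not a real credential
    "monitoring",
    "Global monitoring of web services",
)
print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen and read the uuid field from the parsed JSON response.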
Project settings
The settings object specifies project-level configuration. It describes options such as default configuration for queries and rate limiting.
Property | Type | Purpose | Default | Required |
---|---|---|---|---|
default_query_options | object | See Query Options Reference for descriptions of each option. | See Query Options Reference | No |
blob | string | Don't use this API option. | null | No |
rate_limit | object | Limits bytes per second ingest rate and max payload size. See Rate Limits. | no limit | No |
Tables
A table and its associated transforms represent your data set. Hydrolix stores the table as a compressed, sorted, two-dimensional data structure in a number of .hdx files in cloud storage. Each table belongs to a project, and a project can contain many tables.
The create table API endpoint allows you to define a name for your data, along with defaults and controls for:
- Stream ingest: hot/cold data parameters
- Autoingest: patterns and queues to read for notifications
- Background merge: enable or disable merging to optimize your data storage
- TTL and removal of old data
Before you can ingest data, define a transform (a write schema) for the table that describes the data types to use.
Tables are flexible by design
One table may have multiple ingest transforms. This increases the number of columns of a table.
- Multiple transforms on a single table must share the same datetime column
- The total number of columns is a union of all transforms
- Ideal for very closely associated data sets arriving from different ingest methods
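The rules above can be illustrated with a small Python sketch. It models each transform as a plain dictionary with a name and a list of column names, which is a deliberate simplification and not the real transform schema; the transform and column names are hypothetical.

```python
def table_columns(transforms, datetime_column):
    """Return the union of column names across all transforms of one table.

    Each transform is modeled as {"name": str, "columns": [str, ...]}.
    Raises ValueError if a transform lacks the shared datetime column.
    """
    columns = []
    for transform in transforms:
        if datetime_column not in transform["columns"]:
            raise ValueError(
                f"transform {transform['name']!r} is missing {datetime_column!r}"
            )
        for column in transform["columns"]:
            if column not in columns:  # keep first-seen order, skip duplicates
                columns.append(column)
    return columns


transforms = [
    {"name": "cdn_logs", "columns": ["timestamp", "status", "url"]},
    {"name": "origin_logs", "columns": ["timestamp", "status", "backend"]},
]
print(table_columns(transforms, "timestamp"))
```

With these two hypothetical transforms, the table ends up with four columns: timestamp, status, url, and backend.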
Manage tables using the REST API or the Web UI.
Create a Table using the API
You must authenticate to use the API.
An example create table API request/response exchange:
{
"name": "http_logs",
"description" : "web logs"
}
{
"project": "6b0692f9-c040-47b1-988a-582e57dd3631",
"name": "http_logs",
"description": "web logs",
"uuid": "94dba0fa-24f6-4962-9190-e47ead444ec4",
"created": "2022-05-31T03:48:55.172580Z",
"modified": "2022-05-31T03:48:55.172599Z",
"settings": {
"stream": {
"hot_data_max_age_minutes": 3,
"hot_data_max_active_partitions": 3,
"hot_data_max_rows_per_partition": 12288000,
"hot_data_max_minutes_per_partition": 1,
"hot_data_max_open_seconds": 60,
"hot_data_max_idle_seconds": 30,
"cold_data_max_age_days": 3650,
"cold_data_max_active_partitions": 50,
"cold_data_max_rows_per_partition": 12288000,
"cold_data_max_minutes_per_partition": 60,
"cold_data_max_open_seconds": 300,
"cold_data_max_idle_seconds": 60
},
"age": {
"max_age_days": 0
},
"reaper": {
"max_age_days": 1
},
"merge": {
"enabled": true,
"partition_duration_minutes": 60,
"input_aggregation": 20000000000,
"max_candidates": 20,
"max_rows": 10000000,
"max_partitions_per_candidate": 100,
"min_age_mins": 1,
"max_age_mins": 10080
},
"autoingest": {
"enabled": false,
"source": "",
"pattern": "",
"max_rows_per_partition": 12288000,
"max_minutes_per_partition": 60,
"max_active_partitions": 50,
"input_aggregation": 1073741824,
"dry_run": false
},
"sort_keys": [],
"shard_key": null,
"max_future_days": 0
},
"url": "https://my-domain.hydrolix.live/config/v1/orgs/0ffa6312-61ba-4620-8d57-96514a7f3859/projects/6b0692f9-c040-47b1-988a-582e57dd3631/tables/94dba0fa-24f6-4962-9190-e47ead444ec4"
}
The response contains the uuid of the created table. Reference resources contained within a table (like transforms) using the project uuid path parameter and table uuid in their API endpoints.
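As a sketch of how these identifiers combine, the helpers below rebuild the url value from the example response out of its parts. The path layout is copied from that url field; the helper names are hypothetical, and the transforms suffix in the last function is an assumption about how nested resources are addressed.

```python
def project_url(base_url, org_uuid, project_uuid):
    """URL of a project, following the layout of the url field in the response."""
    return f"{base_url.rstrip('/')}/config/v1/orgs/{org_uuid}/projects/{project_uuid}"


def table_url(base_url, org_uuid, project_uuid, table_uuid):
    """URL of a table inside a project."""
    return f"{project_url(base_url, org_uuid, project_uuid)}/tables/{table_uuid}"


def transforms_url(base_url, org_uuid, project_uuid, table_uuid):
    """URL of a table's transforms. Assumption: the path segment is 'transforms'."""
    return f"{table_url(base_url, org_uuid, project_uuid, table_uuid)}/transforms/"


url = table_url(
    "https://my-domain.hydrolix.live",
    "0ffa6312-61ba-4620-8d57-96514a7f3859",  # organization uuid
    "6b0692f9-c040-47b1-988a-582e57dd3631",  # project uuid
    "94dba0fa-24f6-4962-9190-e47ead444ec4",  # table uuid
)
print(url)
```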
Table settings
The settings object specifies table-level configuration.
Property | Type | Purpose | Default | Required |
---|---|---|---|---|
default_query_options | object | Set query options for this table. See Query Options Precedence and Query Options Reference. | See Query Options Reference | No |
rate_limit | object | Limits bytes per second ingest rate and max payload size. See Rate Limits. | null (no limit) | No |
summary | object | Set this option if you want to create a summary table. | null | No |
stream | object | Set this option to configure stream ingest options for the table. | | No |
age | object | Use this setting to configure a TTL after which data will be deactivated. | See Data Lifecycle Management | No |
reaper | object | Use this setting to configure a TTL after which data will be deleted. | See Data Lifecycle Management | No |
merge | object | Enable/disable merge and configure the merge pools. See the Merge Pools documentation. | "enabled": true , all other nested options default to null | No |
autoingest | array[object] | Enable and configure a continuous, batch ingest task for this table. See also Batch Ingest. | "enabled": false | No |
sort_keys | array[string] | Change the sort order of data as it's ingested and stored. See also Table Settings Reference. | null , Hydrolix sorts columns according to cardinality | No |
shard_key | string | Shard based on a specified key rather than the default, time-based. See also Table Settings Reference. | null , results in time-based sharding | No |
max_future_days | integer | Retain rows whose timestamp is less than this number of days in the future. See also Table Settings Reference. | 0 | No |
max_request_bytes | integer | Maximum allowed request size in bytes as measured by the content length of the request. | 0 (no maximum) | No |
storage_map | object | Assigns a default storage bucket to a table. See also Table Settings Reference. | turbine | No |
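The sketch below shows one way to assemble a create table request body that overrides a few of these settings. It assumes the request accepts a settings object in the same shape as the one returned in the example response; the helper name and the chosen values are hypothetical and purely illustrative. The set of allowed property names is taken from the table above.

```python
import json

# Property names listed in the table settings table above.
KNOWN_TABLE_SETTINGS = {
    "default_query_options", "rate_limit", "summary", "stream", "age",
    "reaper", "merge", "autoingest", "sort_keys", "shard_key",
    "max_future_days", "max_request_bytes", "storage_map",
}


def build_create_table_body(name, description, settings=None):
    """Build a create table request body, rejecting unknown setting names."""
    body = {"name": name, "description": description}
    if settings:
        unknown = set(settings) - KNOWN_TABLE_SETTINGS
        if unknown:
            raise ValueError(f"unknown table settings: {sorted(unknown)}")
        # Assumption: settings are sent in the same shape the API returns them.
        body["settings"] = settings
    return body


body = build_create_table_body(
    "http_logs",
    "web logs",
    settings={
        "merge": {"enabled": True},
        "age": {"max_age_days": 30},  # illustrative value only
    },
)
print(json.dumps(body, indent=2))
```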