Core

Hydrolix uses several "Core" components to operate the system, including the Database bucket and the Catalog:

Database Bucket

Database buckets follow the naming convention hdxcli-xxxxxx. Each Database bucket contains:

platform configuration
table and database configurations
platform tunables
logs
data files encoded in the HDX file format

Hydrolix stores database files as partitions. Each partition corresponds to a time period and contains both raw data and indexes.

For example, a service deployed on AWS uses the following directory structure:

hdxcli-xxxxyy99
├──cf_templates/
├──config/
├──db/
├──hdxinf/
├──logs/
├──results/
└──secrets/

Catalog

The Catalog is a database instance used to manage state information for the stateless components. The Catalog contains:

metadata for the data partitions stored within the Database bucket
information on the jobs and tasks executed as part of ingestion

You can query the Catalog through a reserved view for each table, accessed by appending the suffix #.catalog to the table name:

query-peer :) select * from sample.`my_data#.catalog` limit 1

SELECT *
FROM sample.`my_data#.catalog`
LIMIT 1

Query id: 2627bd78-bc1b-4e4d-bff1-1d30789d9d69

┌─partition────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬───────min_timestamp─┬───────max_timestamp─┬─manifest_size─┬─data_size─┬─index_size─┬─rows─┬─mem_size─┬─root_path─────────────────────────────────────────────────────────────────┬─shard_key─┐
│ b78ff71c-639e-4da4-b194-732630e6b5bb/55a880fb-a606-4bec-88a6-277c6bc9ec03/data/v2/current/1230768000-1435622400-e9a81367ca4523ea.hdx │ 2009-01-01 00:00:00 │ 2015-06-30 00:00:00 │           880 │     15815 │         29 │ 5000 │   141845 │ b78ff71c-639e-4da4-b194-732630e6b5bb/55a880fb-a606-4bec-88a6-277c6bc9ec03 │           │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────┴─────────────────────┴───────────────┴───────────┴────────────┴──────┴──────────┴───────────────────────────────────────────────────────────────────────────┴───────────┘
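
In the sample row above, the partition file name appears to embed two epoch-second values that line up with the min_timestamp and max_timestamp columns, and the first two path segments match root_path. The following Python sketch splits a partition path along those lines. The layout is an assumption inferred from this single sample row, not a documented format; the helper name parse_partition_path and the PartitionInfo type are hypothetical, and the timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class PartitionInfo(NamedTuple):
    """Fields recovered from a partition path (hypothetical type)."""
    root_path: str
    min_timestamp: datetime
    max_timestamp: datetime
    suffix: str


def parse_partition_path(partition: str) -> PartitionInfo:
    """Split a Catalog partition path into its apparent parts.

    Assumed layout, inferred from the sample row only:
    <id>/<id>/.../<start_epoch>-<end_epoch>-<suffix>.hdx
    """
    directory, _, file_name = partition.rpartition("/")
    stem, _, extension = file_name.rpartition(".")
    if extension != "hdx":
        raise ValueError(f"not an .hdx partition file: {file_name!r}")

    # The stem holds two epoch-second values followed by an opaque suffix.
    start, end, suffix = stem.split("-", 2)

    # In the sample row, root_path equals the first two path segments.
    root_path = "/".join(directory.split("/")[:2])

    return PartitionInfo(
        root_path=root_path,
        min_timestamp=datetime.fromtimestamp(int(start), tz=timezone.utc),
        max_timestamp=datetime.fromtimestamp(int(end), tz=timezone.utc),
        suffix=suffix,
    )


if __name__ == "__main__":
    sample = (
        "b78ff71c-639e-4da4-b194-732630e6b5bb/"
        "55a880fb-a606-4bec-88a6-277c6bc9ec03/"
        "data/v2/current/1230768000-1435622400-e9a81367ca4523ea.hdx"
    )
    info = parse_partition_path(sample)
    print(info.root_path)
    print(info.min_timestamp.isoformat())
    print(info.max_timestamp.isoformat())
```

For the sample row, the two decoded values are 2009-01-01 and 2015-06-30 at midnight UTC, the same as the min_timestamp and max_timestamp columns. Treat the Catalog columns as the source of truth and the file name only as a cross-check.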

Other Components

Hydrolix also uses the following minor components:

Component	Description	Scale to 0
Keycloak	Manages Role-Based Access Control (RBAC) and service access control.	No
Operator	Kubernetes Operator that manages Hydrolix scale and infrastructure.	No
Validator	Service for testing Hydrolix Transforms that translate data from input to the internal schema.	Yes
Version	Endpoint that displays the current running version of the Hydrolix platform.	No
Vector	Deployed on each node to collect logs from each service.	No