Hydrolix AWS Architecture

The following diagram provides a high-level view of the Hydrolix stack as deployed in AWS.


Components

The following lists the major components and their purpose. On an initial build, the infrastructure is deployed at a minimal size, using the resources listed below.

Hydrolix Core

| Component | Description | AWS Service | Scale to 0 | Default Type (Scale) |
| --- | --- | --- | --- | --- |
| Bastion | Management node that allows Hydrolix to access and support the infrastructure and the customer. | EC2 | No | t2.micro (x1) |
| Catalog | Contains information about data stored within S3. | Amazon RDS (Postgres) | No | db.t2.medium (HA pair) |
| Config API | API server that provides the command and control API. | EC2 | No | t2.micro (x1) |
| Deploy | A Lambda function that enables the creation of a Hydrolix environment. | Lambda | N/A | - |
| Stream Checkpoint | Keeps ongoing state information for incoming ingest streams. | DynamoDB | N/A | - |
| S3 Bucket (hdxcli-xxx) | The primary storage mechanism for system and database configurations and for data stored by the system. | S3 | N/A | - |
| User Interface | Contains the logic for displaying the user interface. | EC2 | Yes | t2.micro (x1) |
| Zookeeper | Manages the Head and Peer clusters for streaming and query. | EC2 | Yes | t2.micro (x3) |

Hydrolix Compute

Batch Intake

Batch Intake provides a mechanism to scan an S3 bucket or directory and import its contents into the Hydrolix format. Batch intake is enabled through the Config API; more information can be found here - Ingesting

| Component | Description | AWS Service | Scale to 0 | Default Type (Scale) |
| --- | --- | --- | --- | --- |
| Batch Intake API/Batch Lister | Used to manage the import of a batch of data. | Lambda | Yes | - |
| Batch Peers | Workers that ingest and encode data into the HDX format. | EC2 | Yes | r5n.2xlarge (x1) |
| Queues | Two queues used to manage the files being imported. | SQS | No | - |
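As a sketch of how a batch import might be requested through the Config API: the `/config/v1/jobs` path, the host name, and the job payload fields below are all assumptions for illustration; see the Ingesting documentation for the actual endpoint and schema.

```python
# Sketch: creating a batch ingest job through the Config API.
# The job path, host, and payload fields are hypothetical placeholders.
import json
import urllib.request

host = "hdx.example.com"          # hypothetical Hydrolix host
job = {
    "name": "import-2021-logs",   # hypothetical job name
    "settings": {
        # Bucket/prefix to scan for files to import (placeholders)
        "source": {"bucket": "my-source-bucket", "prefix": "logs/2021/"},
    },
}

req = urllib.request.Request(
    url=f"https://{host}/config/v1/jobs",   # hypothetical Config API path
    data=json.dumps(job).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending the request requires a live cluster and credentials:
# urllib.request.urlopen(req)
```

Once accepted, the Batch Lister enumerates the matching files and places them on the SQS queues for the Batch Peers to consume.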

Stream Intake

These components are used for the streaming ingest of your data. Streaming Intake supports two mechanisms: pre-described data and self-described data. More information on streaming ingest can be found here - Ingesting

The stream head provides the API interface that data is sent to. The URL path to the Stream Head is as follows:

https://<host>/ingest
| Component | Description | AWS Service | Scale to 0 | Default Type (Scale) |
| --- | --- | --- | --- | --- |
| Stream Head | Entry point for a data stream; manages the import of streaming data. | EC2 | Yes | m5.large (x1) |
| Stream Peers | Workers that ingest and encode data into the HDX format. | EC2 | Yes | m5.large (x1) |
| Queues | Streams used to buffer incoming data between the Stream Head and the Stream Peers. | Kinesis | No | 2 shards |
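A minimal sketch of sending one JSON event to the Stream Head: the `/ingest` path comes from this document, while the host name, event fields, and the `x-hdx-table` header naming the target table are assumptions for illustration.

```python
# Sketch: posting a single JSON event to the Stream Head at https://<host>/ingest.
import json
import urllib.request

host = "hdx.example.com"                      # hypothetical Hydrolix host
event = {"timestamp": "2021-06-01T12:00:00Z", "status": 200, "path": "/"}

req = urllib.request.Request(
    url=f"https://{host}/ingest",             # path documented above
    data=json.dumps(event).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-hdx-table": "web.logs",            # assumed header naming the target table
    },
    method="POST",
)

# Posting requires a live Stream Head:
# urllib.request.urlopen(req)
```

The Stream Head places accepted events onto the Kinesis shards, from which the Stream Peers encode them into the HDX format.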

Query

These components are used for the retrieval of data from the datastore. The query head provides the API interface that can be used to make SQL queries. The URL path to the Query Head is as follows:

https://<host>/query?query=<sql>

More information on Query can be found here - Querying

| Component | Description | AWS Service | Scale to 0 | Default Type (Scale) |
| --- | --- | --- | --- | --- |
| Query Head | The API endpoint for queries. Manages and aggregates answers from the query peers for return to the end user. | EC2 | Yes | c5n.large (x1) |
| Query Peer | Workers that retrieve and process data from storage to answer queries. | EC2 | Yes | c5n.4xlarge (x1) |
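The `https://<host>/query?query=<sql>` path above implies the SQL must be URL-encoded before it is placed in the query string. A minimal sketch, assuming a placeholder host and table name:

```python
# Sketch: building a query URL for the Query Head, per https://<host>/query?query=<sql>.
import urllib.parse
import urllib.request

host = "hdx.example.com"                      # hypothetical Hydrolix host
sql = "SELECT count(*) FROM web.logs"         # placeholder table name

# urlencode() percent-encodes the SQL so it is safe in the query string.
url = f"https://{host}/query?" + urllib.parse.urlencode({"query": sql})

req = urllib.request.Request(url)
# Running the query requires a live Query Head:
# urllib.request.urlopen(req)
```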

Optional Components

These components are optional and are provided for ease of use. They are installed at the initial build, but can be scaled completely to 0.

| Component | Description | AWS Service | Scale to 0 | Default Type (Scale) | More Information |
| --- | --- | --- | --- | --- | --- |
| Grafana | A pre-integrated version of the Grafana analytics and visualisation tool. | EC2 | Yes | t2.micro (x1) | grafana |