Scaling Guidelines

Data comes in all shapes and sizes, so Hydrolix provides the following guidance on scaling your architecture to suit your data.

Batch Intake

Scaling of Batch Intake is typically determined by the size of the data to be imported: the size of the files being imported directly affects memory utilization. For this reason, Hydrolix recommends the memory-optimized r5 AWS instance family.

Files Under 2GB Compressed (~20GB RAW)

The recommended instance is r5.2xlarge.

Files Greater Than 2GB Compressed

Batch peers consume RAM equal to roughly 4x the size of the RAW data and 10x the size of the compressed data. For files larger than 2GB compressed, apply the following formula:

(Max batch file size) * 10 = Instance memory requirements

Example: 8 GB * 10 = 80 GB

In this case the recommended instance size is r5.4xlarge, the smallest instance in the table below with at least 80 GB of memory.

Memory (GB)   Instance Size
64            r5.2xlarge
128           r5.4xlarge
256           r5.8xlarge
384           r5.12xlarge
512           r5.16xlarge
768           r5.24xlarge
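
The formula and instance table above can be combined into a small lookup. The following is a minimal sketch, not part of any Hydrolix tooling: the function names `required_memory_gb` and `recommend_batch_instance` are hypothetical, and the memory figures are copied from the table as listed.

```python
# Memory per r5 instance size, copied from the table above (GB as listed there).
R5_MEMORY_GB = [
    (64, "r5.2xlarge"),
    (128, "r5.4xlarge"),
    (256, "r5.8xlarge"),
    (384, "r5.12xlarge"),
    (512, "r5.16xlarge"),
    (768, "r5.24xlarge"),
]

# RAM needed per GB of compressed input, per the formula above.
RAM_MULTIPLIER = 10


def required_memory_gb(max_compressed_file_gb: float) -> float:
    """Apply (max batch file size) * 10 = instance memory requirement."""
    return max_compressed_file_gb * RAM_MULTIPLIER


def recommend_batch_instance(max_compressed_file_gb: float) -> str:
    """Return the smallest listed r5 instance that meets the memory requirement."""
    needed = required_memory_gb(max_compressed_file_gb)
    for memory_gb, instance in R5_MEMORY_GB:  # list is sorted by memory, ascending
        if memory_gb >= needed:
            return instance
    raise ValueError(
        f"{needed:g} GB exceeds the largest listed instance; "
        "consider splitting the input into smaller files"
    )


if __name__ == "__main__":
    print(recommend_batch_instance(8))  # the 8 GB example from the text
```

For the 8 GB example this selects r5.4xlarge, and any file under 2GB compressed resolves to r5.2xlarge, matching the recommendations above.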

Streaming Intake

In theory, there is no hard limit on the size of messages Hydrolix can ingest through the Stream API. The only practical consideration is ensuring that the Stream Head instances have sufficient RAM to handle the uncompressed size of each message. Behind the scenes, the stream head splits messages as needed to fit within the limits of Kinesis.

Query

Events Per Second   Recommended Partition Size (Mins)   Max Rows     Partitions per Day   Recommended Query Peer Instance Type   Available Cores   Partitions per Core
18,500              60                                  66,600,000   24                   1x c5n.xlarge                          3                 8
37,000              30                                  66,600,000   48                   1x c5n.2xlarge                         7                 7
74,000              15                                  66,600,000   96                   1x c5n.4xlarge                         15                6
222,000             5                                   66,600,000   288                  1x c5n.9xlarge                         35                8
1,110,000           1                                   66,600,000   1440                 1x c5n.18xlarge                        213               7
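
The columns in the table above are related by simple arithmetic. The sketch below reproduces them under the assumption, inferred from the table values rather than stated by Hydrolix, that Max Rows is the event rate multiplied by the partition duration in seconds, and that Partitions per Day is the number of minutes in a day divided by the partition size. All function names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60
TARGET_MAX_ROWS = 66_600_000  # the Max Rows value shared by every row of the table


def max_rows(events_per_second: int, partition_minutes: int) -> int:
    """Rows written to one partition at a steady event rate."""
    return events_per_second * partition_minutes * 60


def partitions_per_day(partition_minutes: int) -> int:
    """Number of partitions produced in 24 hours."""
    return MINUTES_PER_DAY // partition_minutes


def partitions_per_core(partition_minutes: int, available_cores: int) -> float:
    """Daily partitions each query peer core has to cover."""
    return partitions_per_day(partition_minutes) / available_cores


def partition_minutes_for(events_per_second: int,
                          target_rows: int = TARGET_MAX_ROWS) -> float:
    """Partition size (minutes) that keeps a partition at the target row count."""
    return target_rows / (events_per_second * 60)


if __name__ == "__main__":
    # First row of the table: 18,500 events per second, 60-minute partitions, 3 cores.
    print(max_rows(18_500, 60))
    print(partitions_per_day(60))
    print(partitions_per_core(60, 3))
```

For an event rate that falls between two rows, `partition_minutes_for` gives a starting point for the partition size, which can then be rounded to a convenient interval.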