Scaling Guidelines
Data comes in all shapes and sizes, so Hydrolix provides the following guidance on scaling your architecture to suit your data.
Batch Intake
Batch Intake scaling is typically determined by the size of the files to be imported, because file size directly affects memory utilization. For this reason, Hydrolix recommends the r5 AWS instance type.
Files Under 2GB Compressed (~20GB RAW)
The recommended instance is r5.2xlarge.
Files greater than 2GB Compressed
Batch peers consume RAM equal to roughly 4x the size of the RAW data and 10x the size of the compressed data. For files larger than 2GB compressed, apply the following formula:
(Max batch file size) * 10 = Instance memory requirements
Example: 8 GB * 10 = 80 GB
In this case the recommended instance size is r5.4xlarge, the smallest r5 size in the following table with enough memory.
Memory (GB) | Instance Size |
---|---|
64 | r5.2xlarge |
128 | r5.4xlarge |
256 | r5.8xlarge |
384 | r5.12xlarge |
512 | r5.16xlarge |
768 | r5.24xlarge |
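The sizing rule above can be sketched as a small lookup. This is only an illustration, not a Hydrolix tool: the function name `recommend_batch_instance` and the `R5_SIZES` list are hypothetical, the memory figures are copied from the table above, and the handling of files that exceed the largest listed size (raising an error) is an assumption.

```python
# Memory (GB) -> r5 instance size, copied from the table above.
R5_SIZES = [
    (64, "r5.2xlarge"),
    (128, "r5.4xlarge"),
    (256, "r5.8xlarge"),
    (384, "r5.12xlarge"),
    (512, "r5.16xlarge"),
    (768, "r5.24xlarge"),
]


def recommend_batch_instance(max_compressed_file_gb: float) -> str:
    """Return the smallest listed r5 size for the largest compressed batch file.

    Hypothetical helper: applies (max batch file size) * 10 = memory requirement.
    """
    if max_compressed_file_gb <= 0:
        raise ValueError("file size must be positive")

    # Files under 2 GB compressed: the guidance is r5.2xlarge.
    if max_compressed_file_gb < 2:
        return "r5.2xlarge"

    # Larger files: memory requirement is 10x the compressed size.
    required_gb = max_compressed_file_gb * 10
    for memory_gb, instance in R5_SIZES:
        if memory_gb >= required_gb:
            return instance

    # Assumption: nothing in the table fits, so report it rather than guess.
    raise ValueError(f"{required_gb} GB exceeds the largest listed r5 size")


if __name__ == "__main__":
    print(recommend_batch_instance(8))  # 8 GB * 10 = 80 GB
```

For the 8 GB example, the requirement is 80 GB, and the first table row with at least that much memory is r5.4xlarge.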
Streaming Intake
In theory, there is no hard limit on the size of messages Hydrolix can ingest through the Stream API. The only practical consideration is ensuring that the Stream Head instances have sufficient RAM to handle the uncompressed size of each message. Behind the scenes, the Stream Head splits messages as needed to fit within the limits of Kinesis.
Query
Events Per Second | Recommended Partition Size (Mins) | Max Rows | Partitions per Day | Recommended Query Peer Instance Type | Available Cores | Partitions per core |
---|---|---|---|---|---|---|
18,500 | 60 | 66,600,000 | 24 | 1x c5n.xlarge | 3 | 8 |
37,000 | 30 | 66,600,000 | 48 | 1x c5n.2xlarge | 7 | 7 |
74,000 | 15 | 66,600,000 | 96 | 1x c5n.4xlarge | 15 | 6 |
222,000 | 5 | 66,600,000 | 288 | 1x c5n.9xlarge | 35 | 8 |
1,110,000 | 1 | 66,600,000 | 1440 | 3x c5n.18xlarge | 213 | 7 |
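The derived columns in this table appear to follow from simple arithmetic, sketched below. The relationships are inferred from the numbers themselves rather than stated by Hydrolix: Max Rows is assumed to be events per second multiplied by the partition length in seconds, and Partitions per Core is assumed to be partitions per day divided by available cores, rounded to the nearest whole number. The function name `partition_stats` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def partition_stats(events_per_second: int, partition_minutes: int, available_cores: int):
    """Return (max_rows, partitions_per_day, partitions_per_core) for one table row.

    Hypothetical helper; the formulas are inferred from the table values.
    """
    # Rows accumulated in one partition at a steady event rate.
    max_rows = events_per_second * partition_minutes * 60
    # Number of partitions written per day at this partition size.
    partitions_per_day = MINUTES_PER_DAY // partition_minutes
    # Rounded share of a day's partitions handled by each core.
    partitions_per_core = round(partitions_per_day / available_cores)
    return max_rows, partitions_per_day, partitions_per_core


if __name__ == "__main__":
    print(partition_stats(18_500, 60, 3))
    print(partition_stats(222_000, 5, 35))
```

Under these assumptions, the first row (18,500 events per second, 60-minute partitions, 3 cores) gives 66,600,000 rows, 24 partitions per day, and 8 partitions per core, matching the table.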