Query System Features
The Hydrolix query system supports the following features.
| Name of feature | Brief description |
|---|---|
| Query head pools | Manage separate query head resources by routing ClickHouse clients by database parameter |
| Query peer resource pools | Manage independent, scalable query peer pools for different client populations or purposes |
| Authentication and authorization | Verify client identity and authorize access |
| Account permissions | Enforce role-based access controls on connected clients |
| Data access controls | Allow data administrators to apply granular row- and column-level security controls |
| Spread list | Distribute partitions over multiple object storage locations to increase overall available object storage I/O requests per second |
| Shard key | Split data into separate partitions for a single specified column in addition to the primary timestamp |
| Storage mapping | Retrieve partitions from different object storage locations depending on a row's value |
| Query partition locality | Use consistent hashing on partition filenames to assign partitions to the same query peer in a resource pool when possible |
| Query caching | Store and return cached results for identical queries (not enabled by default) |
| Latest N rows | Return data early during ORDER BY LIMIT queries after collecting enough results, cancelling outstanding requests to query peers |
| Column aliases | Allow multiple names to refer to a single column |
| Custom views | Define a subset of columns available to querying clients |
| Calculated columns | Compute expressions from stored data before returning results to clients |
System-specific features
Query partition locality
For every query it receives, the query head assigns each partition, by filename, to a query peer.
Each query peer retrieves the partition manifest and index files and writes them to local disk. The much larger data files aren't cached.
The query head assigns partitions to peers using a consistent hash of the partition filename. This increases the likelihood that a query peer already has the necessary manifest and index files on its local disk, which minimizes data transfer from object storage when multiple queries require the same partition.
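The effect of consistent hashing on partition assignment can be illustrated with a small, generic sketch. This is not Hydrolix's implementation: the `HashRing` class, the use of MD5, the virtual-node count, and the peer and partition names are all assumptions chosen for illustration. It shows only the general property that a given partition name maps to the same peer on every query, and that adding a peer to the pool reassigns only the partitions that move to the new peer.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (MD5 used for spreading, not security)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring. Peer names and vnode count are hypothetical."""

    def __init__(self, peers, vnodes=64):
        # Each peer owns several points on the ring so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{peer}#{i}"), peer) for peer in peers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def peer_for(self, partition_name: str) -> str:
        # Find the first ring point at or after the partition's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_left(self._points, _hash(partition_name))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical partition names and peer pools.
partitions = [f"partition-{n:04d}" for n in range(1000)]
before = HashRing(["peer-0", "peer-1", "peer-2"])
after = HashRing(["peer-0", "peer-1", "peer-2", "peer-3"])

# Partitions whose assigned peer changes when the pool grows by one peer.
moved = [p for p in partitions if before.peer_for(p) != after.peer_for(p)]
print(f"{len(moved)} of {len(partitions)} partitions changed peers")
```

In this sketch, every partition that changes owner moves to the newly added peer, so the other peers keep serving the partitions whose manifest and index files they may already hold locally. A plain modulo assignment (`hash % number_of_peers`) would instead reshuffle most partitions whenever the pool size changes.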