Query System Features

The Hydrolix query system supports the following features.

Name of feature	Brief description
Query head pools	Manage separate query head resources by routing ClickHouse clients by database parameter
Query peer resource pools	Manage independent, scalable query peer pools for different client populations or purposes
Authentication and Authorization	Verify client identity and authorize access
Account permissions	Enforce role-based access controls on connected clients
Data access controls	Allow data administrators to apply granular row- and column-level security controls
Spread list	Distribute partitions over multiple object storage locations to increase overall available object storage I/O requests per second
Shard key	Split data into separate partitions for a single specified column in addition to the primary timestamp
Column value mapping	Retrieve partitions from different object storage locations depending on a row's value
Query partition locality	Use consistent hashing on partition filenames to assign partitions to the same query peer in a resource pool when possible
Query caching	Store and return cached results for identical queries (not enabled by default)
Latest N rows	Return results early for `ORDER BY ... LIMIT` queries once enough rows have been collected, then cancel outstanding requests to query peers
Column aliases	Allow multiple names to refer to a single column
Custom views	Define a subset of columns available to querying clients
Calculated columns	Compute expressions from stored data before returning results to clients

System-specific features

Query partition locality

For every query it receives, the query head assigns each partition, identified by filename, to a query peer.

Each query peer retrieves the partition manifest and index files and writes them to local disk. The much larger data files aren't cached.

The query head assigns partitions to peers using a consistent hash of the partition filename. This increases the likelihood that a query peer already has the necessary manifest and index files on its local disk, which reduces data transfer from object storage when multiple queries require the same partition.
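
To illustrate the idea, the following is a minimal Python sketch of consistent hashing over partition filenames. It is not the Hydrolix implementation: the `PeerRing` class, the `peer_for` method, the peer and partition names, the choice of SHA-256, and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (unlike built-in hash(), stable across runs)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class PeerRing:
    """Hypothetical consistent-hash ring that maps partition filenames to query peers."""

    def __init__(self, peers, vnodes=64):
        if not peers:
            raise ValueError("at least one query peer is required")
        # Each peer owns several points ("virtual nodes") on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{peer}#{i}"), peer) for peer in peers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def peer_for(self, partition_filename: str) -> str:
        # Choose the first ring point at or after the partition's hash, wrapping around.
        idx = bisect.bisect_left(self._points, _hash(partition_filename))
        return self._ring[idx % len(self._ring)][1]


# Illustrative names only: three peers, then the same pool scaled up by one peer.
partitions = [f"partition-{n:04d}" for n in range(1000)]
before = PeerRing(["peer-0", "peer-1", "peer-2"])
after = PeerRing(["peer-0", "peer-1", "peer-2", "peer-3"])

# Count how many partitions change owner when the pool grows.
moved = sum(before.peer_for(p) != after.peer_for(p) for p in partitions)
print(f"{moved} of {len(partitions)} partitions moved to the new peer")
```

In this sketch, repeated queries for the same partition always resolve to the same peer while the pool is unchanged. When a peer is added, the only partitions that change owner are the ones the new peer takes over, so the files already cached on the other peers remain useful.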