Scaling Query Peer Local Storage

Local disk cache on query peers

The storage value for query-peer pods controls the size of the local disk cache on ephemeral storage. Query peers cache files from cloud object storage locally to avoid re-fetching them on repeated reads.

Prior to v6.0, this cache stored partition metadata files: manifests, dictionaries, and indexes.

Starting in v6.0, the cache can also hold column data (data.hdx files) through byte-range caching.

Byte-range caching stores specific portions, or byte ranges, of a large file in a cache, rather than storing the entire file. This can significantly reduce query latency for frequently accessed data by serving reads from local disk instead of cloud storage.
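The idea can be illustrated with a minimal Python sketch. This is not the query peer's actual implementation: the `ByteRangeCache` class, the fixed 4-byte block size, the `block-N` file naming, and the in-memory `remote_object` standing in for cloud object storage are all assumptions made for illustration.

```python
import os
import tempfile

BLOCK_SIZE = 4  # hypothetical block size in bytes; tiny so the demo is easy to follow


class ByteRangeCache:
    """Caches fixed-size blocks of one remote file on local disk (illustrative only)."""

    def __init__(self, fetch_range, cache_dir, block_size=BLOCK_SIZE):
        self.fetch_range = fetch_range  # callable(start, end) -> bytes, end exclusive
        self.cache_dir = cache_dir
        self.block_size = block_size
        self.remote_fetches = 0  # counts round trips to the "remote" store

    def _block_path(self, index):
        return os.path.join(self.cache_dir, f"block-{index}")

    def _read_block(self, index):
        path = self._block_path(index)
        if os.path.exists(path):
            # Cache hit: serve the block from local disk.
            with open(path, "rb") as f:
                return f.read()
        # Cache miss: fetch only this block's byte range, then store it locally.
        start = index * self.block_size
        data = self.fetch_range(start, start + self.block_size)
        self.remote_fetches += 1
        with open(path, "wb") as f:
            f.write(data)
        return data

    def read(self, offset, length):
        """Return `length` bytes starting at `offset`, touching only the blocks needed."""
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        buf = b"".join(self._read_block(i) for i in range(first, last + 1))
        skip = offset - first * self.block_size
        return buf[skip:skip + length]


# Stand-in for a large object in cloud storage (assumption for the demo).
remote_object = b"0123456789abcdefghij"


def fetch_range(start, end):
    return remote_object[start:end]


cache_dir = tempfile.mkdtemp()
cache = ByteRangeCache(fetch_range, cache_dir)

first_read = cache.read(5, 6)               # needs blocks 1 and 2, both fetched remotely
fetches_after_first = cache.remote_fetches
second_read = cache.read(6, 3)              # same blocks, now served from local disk
fetches_after_second = cache.remote_fetches
```

In this sketch the second read causes no additional remote fetch, and blocks 0, 3, and 4 of the object are never downloaded or stored at all.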

Because data blocks are typically much larger than metadata, enabling byte-range caching increases ephemeral storage consumption. After upgrading to v6.0 and setting disable_data_cache to false, monitor query-peer disk usage. If pods consistently approach their cache eviction thresholds, increase the storage allocation or adjust the cache-related tunables to evict more aggressively. The appropriate size depends on your working set, which is the volume of data that queries access repeatedly.

The following tunables control whether column data is cached and when the cache evicts files:

| Tunable | Default | Purpose |
|---|---|---|
| disable_data_cache | true | Disable column data caching and revert to pre-v6.0 behavior |
| disk_cache_cull_start_perc | 75 | Eviction begins when disk usage reaches this percentage |
| disk_cache_cull_stop_perc | 65 | Eviction stops when disk usage falls to this percentage |
| disk_cache_redzone_start_perc | 90 | New cache writes are refused above this threshold |
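
How the three percentage thresholds interact can be shown with a small Python model. It is a sketch under stated assumptions, not the query peer's real eviction code: the `DiskCacheModel` class is hypothetical, eviction order is assumed to be least recently used first, and culling is invoked explicitly here so that the redzone behavior is visible. Only the default values and the meaning of each threshold come from the tunables listed above.

```python
from collections import OrderedDict


class DiskCacheModel:
    """Toy model of start/stop culling thresholds plus a redzone (illustrative only)."""

    def __init__(self, capacity_bytes, cull_start_perc=75,
                 cull_stop_perc=65, redzone_start_perc=90):
        self.capacity = capacity_bytes
        self.cull_start = cull_start_perc
        self.cull_stop = cull_stop_perc
        self.redzone = redzone_start_perc
        self.files = OrderedDict()  # name -> size, least recently used first
        self.used = 0

    def usage_perc(self):
        return 100.0 * self.used / self.capacity

    def put(self, name, size):
        """Try to cache a file; refuse the write if usage is above the redzone."""
        if self.usage_perc() > self.redzone:
            return False
        self.files[name] = size
        self.used += size
        return True

    def cull(self):
        """Once usage reaches cull_start, evict until usage falls to cull_stop."""
        evicted = []
        if self.usage_perc() < self.cull_start:
            return evicted  # below the start threshold: nothing to do
        while self.files and self.usage_perc() > self.cull_stop:
            # Assumption: evict the least recently used file first.
            name, size = self.files.popitem(last=False)
            self.used -= size
            evicted.append(name)
        return evicted


cache = DiskCacheModel(capacity_bytes=100)
for name in "abcdefgh":
    cache.put(name, 10)                # usage climbs to 80%, past the 75% start threshold

evicted = cache.cull()                 # evicts oldest files until usage is at or below 65%
usage_after_cull = cache.usage_perc()

accepted_big = cache.put("big", 35)    # accepted at 60% usage; pushes usage to 95%
accepted_late = cache.put("late", 1)   # refused: usage is above the 90% redzone
```

The gap between the start and stop thresholds keeps the cache from evicting a single file every time usage crosses one line. In this model, lowering both values makes eviction more aggressive, and the redzone only comes into play when writes outpace culling.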