Subsystem - Query

Overview

The query subsystem enables customers to read data out of the Hydrolix system. See the description of Query in the platform pages.

Hydrolix clusters support multiple Query Interfaces serving many client applications.

Some client applications, such as the Hydrolix Spark Connector, read Hydrolix partitions natively. These clients do not depend on the query subsystem.

Components

These are the query subsystem components in a Hydrolix cluster.

| Component | Description |
|---|---|
| Storage | Locations of optimized, compressed, and indexed partitions in the Hydrolix partition format |
| Catalog | Metadata covering all partitions in all storage locations |
| Query Head | Externally accessible server implementing Query Interfaces |
| Query Peer | Worker software reading partitions from Storage |
| Zookeeper | Registry of active and available query peers, used by the query head |
| Traefik reverse proxy | Network software allowing access to the Query Head |

Diagram

Hydrolix Query System Architecture

Process description

One of the available query heads receives a query from the client. Query processing is the same regardless of which query interface the query arrived on.

  1. A query head receives the client's query and resolves all of the query options.
  2. The query head parses, plans, and optimizes the query. Any errors or failures encountered here terminate processing with an error sent to the client.
  3. The query head fetches a listing of partitions covering the time range from the catalog.
  4. The query head fetches available peers from Zookeeper.
  5. The query head constructs an execution plan from the optimized query, the partition list, and the available peers.
  6. The query head sends a rewritten query

TODO: Mention the importance of data locality and the use of consistent hashing.
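
The TODO above refers to consistent hashing. As a rough illustration of why it helps data locality, the sketch below maps partitions onto peers with a generic consistent-hash ring, so that a given partition keeps landing on the same peer while the peer set is stable, and only a small share of partitions move when a peer joins or leaves. This is not taken from the Hydrolix implementation: the names `HashRing`, `assign_partitions`, the `vnodes` parameter, and the use of SHA-256 are all assumptions made for the example.

```python
import bisect
import hashlib
from collections import defaultdict


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (same value across runs)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, peers, vnodes=64):
        # Place every peer on the ring at `vnodes` pseudo-random points
        # so that load spreads more evenly across peers.
        self._ring = sorted(
            (_hash(f"{peer}#{i}"), peer) for peer in peers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def peer_for(self, partition: str) -> str:
        if not self._ring:
            raise ValueError("no query peers available")
        # Pick the first ring point clockwise from the partition's hash,
        # wrapping around to the start of the ring when needed.
        idx = bisect.bisect_right(self._points, _hash(partition)) % len(self._ring)
        return self._ring[idx][1]


def assign_partitions(partitions, peers):
    """Group partition names by the peer chosen for each of them."""
    ring = HashRing(peers)
    plan = defaultdict(list)
    for partition in partitions:
        plan[ring.peer_for(partition)].append(partition)
    return dict(plan)
```

With this scheme, calling `assign_partitions` twice with the same peer list yields the same grouping, so a peer that has already read a partition is likely to be asked for it again. Removing one peer reassigns only the partitions that peer owned; the others stay where they were.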

Troubleshoot

Key metrics

Example