HTTP Proxy Advanced Configuration

Advanced configuration options for the HTTP proxy

Overview

When the HTTP proxy is enabled, it generates log records for every query it handles. These logs help with analytics, including reporting on cache hits and misses.

Log delivery with Vector and Argus

Vector collects logs from all components, including turbine and http-proxy. It forwards them to:

  • Argus cluster: For centralized analytics and reporting
  • hydro.logs table: For local visibility and querying within each Hydrolix cluster

Logs must use structured JSON to ensure correct parsing. Fields not directly mapped to a schema column are stored in the catchall column. All query-specific fields are embedded in the message object.

Base JSON log format for all services

All logs have this envelope:

{
  "timestamp": "2025-07-30T10:15:32.123456-0700",
  "component": "http-proxy",  // or "turbine"
  "level": "info",
  "message": { ... }          // structured JSON object
}
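
As a rough illustration of how a consumer might read this envelope, the Python sketch below parses one log line and checks that the four envelope keys are present. The function name parse_envelope and its validation rules are assumptions made for this example. They are not part of the HTTP proxy or Vector.

```python
import json

# Envelope keys taken from the format shown above.
ENVELOPE_KEYS = {"timestamp", "component", "level", "message"}


def parse_envelope(line: str) -> dict:
    """Parse one JSON log line and verify the envelope keys are present.

    Hypothetical helper for illustration only.
    """
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("log record must be a JSON object")
    missing = ENVELOPE_KEYS - record.keys()
    if missing:
        raise ValueError(f"missing envelope keys: {sorted(missing)}")
    if not isinstance(record["message"], dict):
        raise ValueError("message must be a structured JSON object")
    return record


# Sample line built from the values used in this document.
sample_line = json.dumps({
    "timestamp": "2025-07-30T10:15:32.123456-0700",
    "component": "http-proxy",
    "level": "info",
    "message": {"query_phase": "begin", "query_id": "abc123"},
})

record = parse_envelope(sample_line)
print(record["component"], record["message"]["query_phase"])
```

Rejecting a non-object message early keeps later code free to treat message as a dictionary of query fields.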

HTTP proxy log format

The HTTP proxy logs two messages per query, one at the beginning and another at the end. These logs are useful for measuring cache performance and auditing incoming query requests.

The cache_hit field is set to 1 for a cache hit and to 0 for a cache miss. Its value is stored in the catchall map field.
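
The sample records in this section carry cache_hit as the string "1", so code that reads the field may want to accept both the string and the integer form. A minimal sketch, assuming a hypothetical helper named is_cache_hit that is not part of the proxy:

```python
def is_cache_hit(value) -> bool:
    """Return True for a cache hit (1 or "1"), False for a miss (0 or "0").

    Hypothetical helper; raises ValueError for any other value.
    """
    text = str(value).strip()
    if text == "1":
        return True
    if text == "0":
        return False
    raise ValueError(f"unexpected cache_hit value: {value!r}")


print(is_cache_hit("1"), is_cache_hit(0))
```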

Before the query is processed:

{
  "query_phase": "begin",
  "user": "example_user",
  "query": "SELECT * FROM my_table",
  "interface": "http",
  "query_start_time": 1722354932,
  "cache_hit": "1",
  "query_id": "abc123",
  "initial_query_id": "abc123"
}

After the query is processed:

{
  "query_phase": "end",
  "query_attempts": "1",
  "exception": "",
  "exception_code": "",
  "user": "example_user",
  "query": "SELECT * FROM my_table",
  "interface": "http",
  "query_start_time": 1722354932,
  "cache_hit": "1",
  "query_id": "abc123",
  "initial_query_id": "abc123"
}

These fields are embedded in the message property of the main log object.
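
To show how the begin and end records might be used together, the sketch below reads full log lines, keeps only http-proxy records, computes a cache hit ratio over the end records, and lists queries that logged a begin record but no end record. The function summarize_proxy_logs, the helper make_line, the query IDs def456 and ghi789, and the turbine message content are all invented for this example.

```python
import json


def summarize_proxy_logs(lines):
    """Return (hit_ratio, unfinished_ids) for http-proxy query logs.

    hit_ratio is computed over "end" records only, so each completed
    query is counted once. It is None when no "end" record is seen.
    """
    begun, ended = set(), set()
    end_count = 0
    hits = 0
    for line in lines:
        record = json.loads(line)
        message = record.get("message")
        if record.get("component") != "http-proxy" or not isinstance(message, dict):
            continue
        query_id = str(message.get("query_id", ""))
        phase = message.get("query_phase")
        if phase == "begin":
            begun.add(query_id)
        elif phase == "end":
            ended.add(query_id)
            end_count += 1
            # cache_hit is the string "1" in the samples; accept 1 as well.
            if str(message.get("cache_hit")) == "1":
                hits += 1
    ratio = hits / end_count if end_count else None
    return ratio, sorted(begun - ended)


def make_line(component, message):
    """Wrap a message in the log envelope (illustrative helper)."""
    return json.dumps({
        "timestamp": "2025-07-30T10:15:32.123456-0700",
        "component": component,
        "level": "info",
        "message": message,
    })


sample = [
    make_line("http-proxy", {"query_phase": "begin", "query_id": "abc123", "cache_hit": "1"}),
    make_line("http-proxy", {"query_phase": "end", "query_id": "abc123", "cache_hit": "1"}),
    make_line("http-proxy", {"query_phase": "begin", "query_id": "def456", "cache_hit": "0"}),
    make_line("http-proxy", {"query_phase": "end", "query_id": "def456", "cache_hit": "0"}),
    make_line("http-proxy", {"query_phase": "begin", "query_id": "ghi789", "cache_hit": "0"}),
    make_line("turbine", {"note": "placeholder, skipped by the filter"}),
]

ratio, unfinished = summarize_proxy_logs(sample)
print(ratio, unfinished)
```

Counting only end records avoids counting a query twice, since both the begin and end messages carry cache_hit.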