Endpoint Errors

Summary uses the same components as Streaming, so when troubleshooting Summary errors, also consult the Streaming troubleshooting guidance.

Issue: Summary Data not Created

Check Summary Sources API Values

API: https://docs.hydrolix.io/reference/list-summary-sources

Check:

  • parent_table - the name of the parent table in Streaming ingest that receives the incoming data
  • subtype - must be “summary”
  • table - the target table, in the form “project.tablename”
  • transform - the name of the transform used to ingest into the table
  • type - must be “pull”
  • service - must be “summary-peer”
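
The checklist above can be applied mechanically to each entry returned by the summary sources API. The following is a minimal sketch, not an official client: it assumes each entry has already been fetched and is available as a flat Python dict keyed by the field names listed above. The real response may nest these fields differently, and the function name is hypothetical.

```python
# Expected values come from the checklist above. The flat dict layout is an
# assumption for illustration; the real API response may nest these fields.
EXPECTED = {"subtype": "summary", "type": "pull", "service": "summary-peer"}
REQUIRED = ("parent_table", "subtype", "table", "transform", "type", "service")


def check_summary_source(source: dict) -> list[str]:
    """Return a list of human-readable problems found in one source entry."""
    problems = []
    # Every field in the checklist must be present and non-empty.
    for field in REQUIRED:
        if not source.get(field):
            problems.append(f"missing or empty field: {field}")
    # Fields with a fixed value must match it exactly.
    for field, expected in EXPECTED.items():
        actual = source.get(field)
        if actual and actual != expected:
            problems.append(f"{field} is {actual!r}, expected {expected!r}")
    # The target table must look like project.tablename.
    table = source.get("table", "")
    if table:
        parts = table.split(".")
        if len(parts) != 2 or not all(parts):
            problems.append(f"table {table!r} is not in project.tablename form")
    return problems
```

An empty list means the entry passes every check in the list; any returned string names the field to correct.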

Check Stream-Peer Logs and Stream-Peer-Turbine Logs

API: System Health

Check:

  • Datetime format of the primary timestamp column. A mismatch here is often the cause of rejected rows.
  • String values being written to a column typed as UINT.
  • Transform SQL.

Review transform and edit accordingly.

https://docs.hydrolix.io/docs/transforms-and-write-schema

https://docs.hydrolix.io/docs/timestamp-data-types

https://docs.hydrolix.io/docs/summary-tables-aggregation
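
When the logs point to a datetime or UINT rejection, it can help to test a few sample values from the incoming data before editing the transform. The sketch below is illustrative only. It uses Python's `strptime` directives, which are not necessarily the format syntax the transform itself uses (see the timestamp documentation linked above), and both helper names are hypothetical.

```python
from datetime import datetime


def parses_as(value: str, fmt: str) -> bool:
    """True if value matches the strptime-style format fmt."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def fits_uint(value, bits: int = 32) -> bool:
    """True if value is a plain non-negative integer that fits in `bits` bits."""
    text = str(value)
    # Reject anything that is not purely ASCII digits (letters, signs, decimals).
    if not (text.isascii() and text.isdigit()):
        return False
    return int(text) < 2 ** bits
```

For example, `parses_as("31/01/2024", "%Y-%m-%d %H:%M:%S")` returns `False`, which would flag a sample row whose timestamp does not match the declared format, and `fits_uint("abc")` returns `False` for a string that cannot be stored as an unsigned integer.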

Check the Transform

API: https://docs.hydrolix.io/reference/list-transforms

  • Ensure the output columns for the parent table match the incoming data.
  • Check that the transform SQL outputs the values the summary transform expects.
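
The first check above can be sketched as a comparison between one sample incoming record and the transform's declared output columns. This is a minimal illustration that assumes the output columns are available as a list of dicts, each with a `name` key. That shape is an assumption about the transform JSON, and the function name is hypothetical.

```python
def compare_columns(record: dict, output_columns: list[dict]) -> dict:
    """Compare the keys of one incoming record with the declared column names."""
    # Assumed shape: each output column is a dict with at least a "name" key.
    declared = {col["name"] for col in output_columns}
    incoming = set(record)
    return {
        # Fields present in the data that the transform does not declare.
        "not_in_transform": sorted(incoming - declared),
        # Columns the transform declares that the sample record lacks.
        "not_in_record": sorted(declared - incoming),
    }
```

Two empty lists indicate that the sample record and the transform agree on column names. Datatype mismatches still need to be checked separately.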

Stream Summary Service Components

Accessible via the path: https://<yourhost>.hydrolix.live/ingest/event
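
To confirm the ingest path is reachable with a known-good row, a test request can be built against this path. The sketch below only constructs the request and does not send it. The `x-hdx-table` and `x-hdx-transform` header names are assumed here and should be confirmed against the ingest documentation. Authentication is omitted, and the function name and arguments are hypothetical.

```python
import json
import urllib.request


def build_ingest_request(host: str, table: str, transform: str,
                         rows: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request to the ingest path."""
    url = f"https://{host}.hydrolix.live/ingest/event"
    headers = {
        "content-type": "application/json",
        # Assumed header names; confirm against the ingest documentation.
        "x-hdx-table": table,          # target "project.tablename"
        "x-hdx-transform": transform,  # transform to apply on ingest
    }
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it. The status code and response body can then be compared with the Stream-Head logs when a row is rejected.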

| Component | Used to |
| --- | --- |
| Traefik (Application Load-balancer) | Routes requests to the appropriate end-points. Requests to the path /ingest/ are routed to Stream-Head components. API and Portal requests are routed via their own paths. |
| Stream-Head | Receives HTTP requests and sends them to Redpanda. Performs basic verification on incoming messages, including datetime checks, header information checks, basic transform checks and incoming message format checks. Manages the size of messages placed on the queue: if a message is too big, it is split into suitably sized messages. |
| Redpanda | Queues messages received from the Stream-Head. Uses persistent volumes for queues. |
| Summary-Peer (Summary-Peer) | Consumes messages off the Redpanda queue. |
| Summary-Peer (Indexer) | Parses data in two stages. The first stage parses data using the parent table's transform, including any functions, dictionaries or enrichments. The second stage applies the summary's transform settings, including any enrichments. Sends completed summary files to cloud storage (GCS, S3 etc) and updates the Postgres Catalog with partition metadata. |
| Catalog | Stores information on the basic storage structure and partitions of the data within cloud storage (GCS, S3 etc). Includes a persistent volume within Kubernetes. |
| Cloud Storage Bucket | Storage bucket (GCS, S3 etc) containing the “stateful” data required to run the system, including configuration files (/config/), database (/db/) and a copy of the system logs (/logs/). |
| UI | User interface / Portal. Built upon the Turbine-API. |
| Turbine-API | REST-based API for configuring the data system. Includes API end-points for creating, deleting and editing tables and their transforms (schemas). |
| Keycloak | Provides authorization and RBAC for access to the Turbine-API and the Portal. Stores metadata and user information in the Catalog DB instance. |
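
The message splitting described for Stream-Head can be pictured with a small sketch. It is purely conceptual and is not the Stream-Head implementation: it assumes rows are JSON objects, measures size as the encoded JSON length of each row, and uses a hypothetical `max_bytes` limit.

```python
import json


def split_by_size(rows: list[dict], max_bytes: int) -> list[list[dict]]:
    """Group rows into batches whose combined encoded size stays within max_bytes."""
    batches, current, current_size = [], [], 0
    for row in rows:
        size = len(json.dumps(row).encode("utf-8"))
        # Start a new batch when adding this row would exceed the limit.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(row)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

A single row larger than `max_bytes` ends up in a batch of its own in this sketch. How the real component treats such a row is not stated here.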