Summary uses the same components as Streaming, so when investigating Summary errors, also refer to Streaming.
- parent_table: the name of the parent table in Streaming ingest that supplies the incoming data
- subtype: must be "summary"
- table: the target table, in the form "project.tablename"
- transform: the name of the transform used to ingest the table
- type: must be "pull"
- service: must be "summary-peer"
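The constraints above can be checked before a configuration is applied. The following is a minimal sketch, not part of the product: the function name `check_summary_settings` and the idea of holding the settings in a plain dictionary are assumptions made for illustration. Only the six setting names and the three fixed values come from the list above.

```python
# Hypothetical pre-flight check for the summary settings listed above.
# The setting names and fixed values come from the list; everything else
# (function name, dict representation) is an assumption for illustration.
REQUIRED_KEYS = ("parent_table", "subtype", "table", "transform", "type", "service")
FIXED_VALUES = {"subtype": "summary", "type": "pull", "service": "summary-peer"}


def check_summary_settings(settings: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    # Every setting in the list must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not settings.get(key):
            problems.append(f"missing or empty setting: {key}")
    # Three settings accept exactly one value.
    for key, expected in FIXED_VALUES.items():
        value = settings.get(key)
        if value and value != expected:
            problems.append(f"{key} must be {expected!r}, got {value!r}")
    # The target table is written as "project.tablename".
    table = settings.get("table")
    if table and (table.count(".") != 1 or not all(table.split("."))):
        problems.append("table must have the form 'project.tablename'")
    return problems
```

Calling it with a dictionary such as `{"parent_table": "raw_events", "subtype": "summary", "table": "myproject.events_summary", "transform": "summary_transform", "type": "pull", "service": "summary-peer"}` (all values here are placeholders) returns an empty list.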
API: System Health
- Datetime format of the primary column. A mismatch here is often the cause of rejected rows.
- String values are being stored in a UINT column.
- Transform SQL
Review the transform and edit it accordingly.
- Ensure the output columns for the parent table match the incoming data.
- Check that the transform SQL outputs the values the summary transform expects.
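Two of the causes above, a datetime value that does not match the expected format and a string arriving where an unsigned integer is expected, can be caught by testing a sample row before it is sent. This is a rough sketch under stated assumptions: `check_row` and its parameters are hypothetical, and the format string uses Python's `strptime` notation, which may differ from the notation the transform itself uses.

```python
from datetime import datetime


def check_row(row: dict, datetime_column: str, datetime_format: str,
              uint_columns: list[str]) -> list[str]:
    """Return problems found in one sample row (hypothetical helper)."""
    problems = []
    # Cause 1: the datetime value does not match the expected format.
    raw = row.get(datetime_column)
    try:
        datetime.strptime(str(raw), datetime_format)
    except ValueError:
        problems.append(f"{datetime_column}: {raw!r} does not match {datetime_format!r}")
    # Cause 2: a string (or other non-integer) where an unsigned integer is expected.
    for column in uint_columns:
        value = row.get(column)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{column}: {value!r} is not an unsigned integer")
    return problems
```

For example, `check_row({"timestamp": "15/01/2023", "bytes": "512"}, "timestamp", "%Y-%m-%d %H:%M:%S", ["bytes"])` reports both the datetime mismatch and the quoted number; the column names are placeholders.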
Accessible via the path:
|Component|Description|
|---|---|
|Traefik (Application Load Balancer)|Routes requests to the appropriate endpoints. Requests to the path /ingest/ are routed to the stream-head components. API and Portal requests are routed via their own paths.|
|Stream-Head|Receives HTTP requests and sends them to Redpanda. Performs basic verification of incoming messages, including datetime checks, header information checks, basic transform checks, and incoming message format. Manages message size on the queue: if a message is too big, it is split into suitably sized messages.|
|Redpanda|Queues messages received from the stream head. Uses persistent volumes for its queues.|
|Summary-Peer (Summary-Peer)|Consumes messages off the Redpanda queue.|
|Summary-Peer (Indexer)|Parses data in two stages. The first stage parses the data using the parent table's transform, including any functions, dictionaries, or enrichments. The second stage applies the summary's transform settings, including any enrichments. Sends completed summary files to cloud storage (GCS, S3, etc.) and updates the Postgres Catalog with partition metadata.|
|Catalog|Stores information on the basic storage structure and partitions of the data within cloud storage (GCS, S3, etc.). Includes a persistent volume within Kubernetes.|
|Cloud Storage Bucket|Storage bucket (GCS, S3, etc.) containing the "stateful" data required to run the system, including configuration files (/config/), the database (/db/), and a copy of the system logs (/logs/).|
|UI|User interface (Portal). Built upon the Turbine-API.|
|Turbine-API|REST-based API for configuring the data system. Includes API endpoints for creating, deleting, and editing tables and their transforms (schemas).|
|Keycloak|Provides authorization and RBAC for access to the Turbine-API and the Portal. Stores metadata and user information in the Catalog DB instance.|
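
The table notes that the stream head splits messages that are too big for the queue. The sketch below illustrates the general idea of size-based splitting only. It is not the stream head's actual algorithm: the function `split_batch`, the JSON size measure, and the `max_bytes` limit are all assumptions chosen for illustration.

```python
import json


def split_batch(records: list[dict], max_bytes: int) -> list[list[dict]]:
    """Group records into batches whose JSON-encoded size stays within max_bytes.

    A single record larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        # Start a new batch when adding this record would exceed the limit.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Record order is preserved across batches, so concatenating the batches reproduces the original list.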