Set Up a Summary Table for Data Ingested via Kinesis

To configure a summary table, complete the following steps:

  1. Define summary transform.
  2. Create summary table.

Define Summary Transform

The summary transform defines a SQL statement that aggregates your data over some period of time.

The following example shows a general form for summary transforms:

   <time_field_aggregation> AS <alias>,
   <function_on_column> AS <alias>,
		hdx_primary_key = '<time_field_alias>'

Summary transforms are subject to the following limitations:

  • In the SETTINGS clause, you must specify a hdx_primary_key. This primary key should correspond to a time field aggregation alias in your SELECT statement.
  • In the FROM clause, specify the project name and the table name of the parent table.


Test Summary Transform

To test your summary transform, run it without the SETTINGS clause on the parent table.


Consider a summary table that aggregates, on an hourly basis:

  • the sum of item cost
  • the average tax
  • the 95th percentile of my.journeys

The following SQL shows how you could translate these requirements into a summary transform:

    toStartOfHour(timestamp) as hour,
    sum(cost) as sumcost,
    avg(tax) as avgtax,
    quantile(.95)(distance['local']) as _95th
group by
SETTINGS hdx_primary_key = 'hour'

Create Summary Table

The summary table persists the aggregation results generated by your summary transform. You can create your summary table with the Hydrolix UI or the API.

To create a table with the Hydrolix API, see the API Reference.

Your API request should contain a kinesis field that specifyies a parent_source similar to the following JSON request body:

	"name": "<table_name>",
	"type": "summary",
	"settings": {
		"summary": {
			"sql": "<summary transform>",
			"kinesis": {
				"parent_source": "<parent_project.parent_table>",
				"checkpointer": { // optional. defaults to parent checkpointer
					"name": "<custom checkpointer table>"

Add your summary transform here, as well as the project and table you'd like to use for your parent table. You can optionally specify a checkpointer table.