Insert / Save Data Manually

Use manual data insertion for ad-hoc jobs, not for ongoing insertion or ingestion.

For standard data ingestion that scales with load, either use Stream Ingest or Batch Ingest . Summary Tables should be used for aggregations.

There are two ways to insert data manually:

  • From a query result
  • From raw data

Set up a table

See Projects & Tables to create a table.

Configure the default transform

A default transform must be deployed. See Write Transforms for more detail.

{
    "name": "mytransform",
    "type": "json",
    "description": "an example for manual data insertion",
    "settings": {
        "is_default": true,
        "output_columns": [
            {
                "name": "timestamp",
                "datatype": {
                    "primary": true,
                    "resolution": "seconds",
                    "format": "2006-01-02 15:04:05",
                    "type": "datetime"
                }
            },
            {
                "name": "mycolumn1",
                "datatype": {
                    "type": "string"
                }
            },
            {
                "name": "mycolumn2",
                "datatype": {
                    "type": "uint32"
                }
            }
        ],
        "compression": "none",
        "format_details": {}
    }
}

⚠️

Limitations

  • Insertion fails if the primary timestamp field is missing
  • No ingestion scripting features execute when inserting data manually
  • The transform must be the default: "is_default": true

Insert data

You can query another Hydrolix table in the cluster and directly insert the query results. For data not already in the cluster, you can also manually insert data.

Insert from a query result

To insert from a query result, combine the INSERT INTO statement with a SELECT statement.

INSERT INTO myproject.mytable (timestamp, mycolumn1, mycolumn2)
SELECT toStartOfDay(timestamp) as day, mycolumn1, mycolumn2
FROM myproject.anothertable
GROUP BY day, mycolumn1

This generates a query with the SELECT toStartOfDay(timestamp) as day ... and inserts the resulting rows into myproject.mytable.

Insert manually

Insert data manually.

INSERT INTO myproject.mytable (timestamp, mycolumn1, mycolumn2)
VALUES ('2021-11-30 15:04:05', 'aardvark', 1)

You can insert multiple rows in a single statement.

INSERT INTO myproject.mytable (timestamp, mycolumn1, mycolumn2)
VALUES ('2021-11-30 15:04:05', 'badger', 1),('2021-12-01 15:04:05', 'civet', 2)

After the above two commands

> SELECT timestamp, mycolumn1, mycolumn2 FROM myproject.mytable
aardvark        1       2021-11-30 15:04:05
badger  1       2021-11-30 15:04:05
civet   2       2021-12-01 15:04:05

The response for the INSERT INTO is a JSON object containing the bytes written, number of partitions, and rows written. Authorization credentials are omitted from the below example, see HTTP Query API for examples.

$ curl -X POST --url "$HDX_HYDROLIX_URL/query" --data-binary @insert-into.sql
{
    "bytes_written": 194315,
    "num_partitions": 1,
    "rows_written": 13872
}