Rejects
When ingesting data into the Hydrolix platform, a source system will occasionally send incorrect data that cannot be loaded. Hydrolix rejects this data. To ease debugging, the rejected files and rows are stored in a separate directory for later review.
Reject Locations
Rejects are stored within cloud storage. The path can be found as follows:
{storagebucket}/db/hdx/{project_uuid}/{table_uuid}/unknown/{service type}
For example, in a GKE deployment, the following listing shows three reject files:
$ gsutil ls gs://hdxcli-gcpprodv/db/hdx/3470db40-bf27-44f9-bc0b-a0890ba2fea8/61d30656-dc13-49eb-ae44-234b37e2b2e4/unknown/stream/
gs://hdxcli-gcpprodv/db/hdx/3470db40-bf27-44f9-bc0b-a0890ba2fea8/61d30656-dc13-49eb-ae44-234b37e2b2e4/unknown/stream/20220912142005-rejects-format-YkLZTZVbHbow.json
gs://hdxcli-gcpprodv/db/hdx/3470db40-bf27-44f9-bc0b-a0890ba2fea8/61d30656-dc13-49eb-ae44-234b37e2b2e4/unknown/stream/20220913045633-rejects-format-KN6bVopaF56O.json
gs://hdxcli-gcpprodv/db/hdx/3470db40-bf27-44f9-bc0b-a0890ba2fea8/61d30656-dc13-49eb-ae44-234b37e2b2e4/unknown/stream/20220913210236-rejects-format-3PaNbHf55l0F.json
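The path layout above can be assembled in a script when you need to point a storage tool at the right prefix. The following is a minimal Python sketch, not part of Hydrolix itself. The function names are hypothetical, and reading the leading 14 digits of a reject file name as a YYYYMMDDHHMMSS timestamp is an assumption based on the file names in the listing.

```python
from datetime import datetime
from typing import Optional


def reject_prefix(bucket: str, project_uuid: str, table_uuid: str, service_type: str) -> str:
    """Build the reject prefix from the documented path layout.

    `bucket` includes the storage scheme, for example "gs://my-bucket".
    """
    return f"{bucket}/db/hdx/{project_uuid}/{table_uuid}/unknown/{service_type}/"


def reject_file_time(object_name: str) -> Optional[datetime]:
    """Return the time encoded at the start of a reject file name, if any.

    Assumption: the leading 14 digits are a YYYYMMDDHHMMSS timestamp.
    Returns None when the name does not start with such a stamp.
    """
    base = object_name.rsplit("/", 1)[-1]   # drop any directory part
    stamp = base.split("-", 1)[0]           # text before the first hyphen
    if len(stamp) != 14 or not stamp.isdigit():
        return None
    try:
        return datetime.strptime(stamp, "%Y%m%d%H%M%S")
    except ValueError:                      # e.g. month 13
        return None
```

For example, `reject_prefix("gs://hdxcli-gcpprodv", project_uuid, table_uuid, "stream")` produces the prefix passed to `gsutil ls` in the listing, and `reject_file_time` can be used to sort or filter the returned object names by time.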
To find the UUIDs for your projects and tables, use the Summary endpoint in the API (Organization Summary).
Reject File Format
Each reject file or message is written as a JSON object in the following format:
{
  "project_id": "project_uuid",
  "table_id": "table_uuid",
  "transform_id": "",
  "data": [
    {
      "data1": "data1",
      "data2": "data2"
    }
  ],
  "reason": "The reason for the failure"
}
The project_id, table_id, and transform_id fields are the UUIDs of the project, table, and transform the data was destined for. Note that transform_id may be blank; in that case it is likely that a default transform was used.
The data field contains the data or messages that were rejected.
Finally, the reason field contains the reason for the rejection. The following are some examples of common failures:
Failure | Reason |
---|---|
"strconv.ParseInt: parsing "text": invalid syntax" | The field has been given a string value when a numeric Int was expected. The error can contain the name of the column. Often it is easiest to search through the data to see what has been supplied and match this to the transform. |
"strconv.ParseUint: parsing "unknown": invalid syntax" | The field has been given the string value "unknown" when a UInt was expected. This can happen with custom null values. To enable custom nulls, use the nulls attribute within a Transform's output columns. |
"reason": "unexpected EOF" | Often occurs when a message body is independently compressed and this has not been enabled within the Transform. |
"reason":"event primary too old" | Occurs when the event primary being loaded is older than the maximum age specified by the table setting cold_data_max_age_days. |
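When many reject files accumulate, it can help to tally the reasons and attach a hint to each one. The following is a minimal Python sketch under stated assumptions: each reject file is assumed to hold a single JSON object with the fields described above, the hints paraphrase the table, and the function names and substring matching are illustrative rather than anything Hydrolix provides.

```python
import json
from collections import Counter
from typing import Dict, Iterable

# Hints paraphrased from the table above, keyed by a substring of the reason.
# Matching is case-insensitive, so "ParseUInt" and "ParseUint" both match.
HINTS: Dict[str, str] = {
    "strconv.ParseInt": "A string value was supplied where an Int was expected.",
    "strconv.ParseUint": "A string value was supplied where a UInt was expected; check custom nulls.",
    "unexpected EOF": "The message body may be compressed without compression enabled in the transform.",
    "event primary too old": "The event primary is older than cold_data_max_age_days allows.",
}


def explain(reason: str) -> str:
    """Return a hint for a reject reason, or a fallback when none matches."""
    lowered = reason.lower()
    for needle, hint in HINTS.items():
        if needle.lower() in lowered:
            return hint
    return "No known hint; inspect the rejected data against the transform."


def describe(reject_text: str) -> Dict[str, str]:
    """Summarize one reject document (assumed to be a single JSON object)."""
    record = json.loads(reject_text)
    reason = record.get("reason", "")
    # A blank or missing transform_id likely means a default transform was used.
    transform = record.get("transform_id") or "(blank, likely default transform)"
    return {"transform": transform, "reason": reason, "hint": explain(reason)}


def count_reasons(reject_texts: Iterable[str]) -> Counter:
    """Count how often each reason occurs across reject documents."""
    return Counter(describe(text)["reason"] for text in reject_texts)
```

Passing the contents of each downloaded reject file to `count_reasons` gives a quick view of which failure dominates, and `describe` shows the hint for a single file.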