Run Ingest Job

A batch job is a one-off S3 ingest task. It can target a single file, a directory containing many files, or a set of directories and files selected by a regular expression. Configure batch notification through the table settings.

Submit a batch job

Before submitting a batch job, ensure your batch-peers are scaled appropriately. See scaling your deployment for details.

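Submit a batch job by sending a POST request to the batch job API. Replace the hostname, {{org_uuid}}, and the bearer token with your own values.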

curl -X POST 'https://my-domain.hydrolix.live/config/v1/orgs/{{org_uuid}}/jobs/batch' \
-H 'Authorization: Bearer thebearertoken1234567890abcdefghijklmnopqrstuvwxyz' \
-H 'Content-Type: application/json' \
-d '{
    "name": "My batch job",
    "description": "A description of my job",
    "type": "batch_import",
    "settings": {
        "source": {
            "table": "website.events",
            "type": "batch",
            "subtype": "aws s3",
            "transform": "mytransformname",
            "settings": {
                "url": "s3://root-bucket/folder1/"
            }
        }
    }
}'

The response describes the newly created job:

{
      "name": "myjob",
      "description": "my job",
      "uuid": "888ba890-3ece-403d-9753-4edd754bef61",
      "created": "2021-06-03T12:28:41.967285Z",
      "modified": "2021-06-03T12:28:42.260107Z",
      "settings": {
         "max_active_partitions": 576,
         "max_rows_per_partition": 33554432,
         "max_minutes_per_partition": 15,
         "input_concurrency": 1,
         "input_aggregation": 1536000000,
         "max_files": 0,
         "dry_run": false,
         "regex_filter": "",
         "source": {
            "type": "batch",
            "subtype": "aws s3",
            "transform": "mytransform",
            "table": "website.events",
            "settings": {
               "url": "job10"
            }
         }
      },
      "status": "ready",
      "type": "batch_import",
      "org": "d1234567-1234-1234-abcd-defgh123456",
      "details": {
         "errors": [],
         "job_id": "jobid-1234-abcdejhijklm",
         "duration_ms": 115,
         "status_detail": {
            "tasks": {
               "LIST": {
                  "READY": 1
               }
            },
            "estimated": true,
            "percent_complete": 0
         }
      }
}

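The status field in the response shows that the job is ready. Keep the uuid value: it identifies the job in later requests, where it fills the {{job_uuid}} placeholder.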
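
The request above can be wrapped in a small shell helper so that the table, transform, and source URL become arguments. The following is a sketch under stated assumptions rather than an official client: the function names and the `HDX_HOST`, `HDX_ORG`, and `HDX_TOKEN` variables are hypothetical, and sending `regex_filter` under `settings` is an assumption based on where that field appears in the response.

```shell
#!/usr/bin/env bash
# Sketch: build a batch job body and submit it with curl.
# HDX_HOST, HDX_ORG and HDX_TOKEN are hypothetical variable names.

# Print a batch job body as JSON.
# Arguments: table, transform, S3 URL, optional regular expression.
# Values are inserted verbatim, so escape quotes and backslashes beforehand.
batch_job_body() {
  local table="$1" transform="$2" url="$3" regex="${4:-}"
  printf '{"name":"My batch job","type":"batch_import","settings":{'
  if [ -n "$regex" ]; then
    # Assumption: the request accepts regex_filter at the same level
    # at which the response reports it.
    printf '"regex_filter":"%s",' "$regex"
  fi
  printf '"source":{"table":"%s","type":"batch","subtype":"aws s3",' "$table"
  printf '"transform":"%s","settings":{"url":"%s"}}}}\n' "$transform" "$url"
}

# POST the body to the batch job endpoint; "-d @-" reads the body from stdin.
submit_batch_job() {
  batch_job_body "$@" | curl -sS -X POST \
    "https://${HDX_HOST}/config/v1/orgs/${HDX_ORG}/jobs/batch" \
    -H "Authorization: Bearer ${HDX_TOKEN}" \
    -H 'Content-Type: application/json' \
    -d @-
}
```

With the three variables exported, `submit_batch_job website.events mytransformname 's3://root-bucket/folder1/'` sends the same source settings as the curl command above. Running `batch_job_body` on its own prints the body without contacting the API, which is useful for checking it first.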

Get job status

Use the job-status API to check whether your job has finished. Hydrolix regularly updates this endpoint with information about running jobs.

curl -X GET 'https://hostname.hydrolix.live/config/v1/orgs/{{org_uuid}}/jobs/batch/{{job_uuid}}/status' \
-H 'Authorization: Bearer thebearertoken1234567890abcdefghijklmnopqrstuvwxyz'

The response contains the job definition and its current status:

[
{
      "name": "myjob",
      "description": "my job",
      "uuid": "888ba890-3ece-403d-9753-4edd754bef61",
      "created": "2021-07-28T15:16:10.663363Z",
      "modified": "2021-07-28T15:42:56.511741Z",
      "settings": {
         "max_active_partitions": 576,
         "max_rows_per_partition": 20000000,
         "max_minutes_per_partition": 15,
         "input_concurrency": 1,
         "input_aggregation": 1536000000,
         "max_files": 0,
         "dry_run": false,
         "regex_filter": "",
         "source": {
            "type": "batch",
            "subtype": "aws s3",
            "transform": "mytransform",
            "table": "myproject.mytable",
            "settings": {
               "url": "s3://mys3/path/goes/here/"
            }
         }
      },
      "status": "done",
      "type": "batch_import",
      "org": "d1234567-1234-1234-abcd-defgh123456",
      "details": {
         "errors": [],
         "job_id": "jobid-1234-abcdejhijklm",
         "duration_ms": 7194,
         "status_detail": {
            "tasks": {
               "LIST": {
                  "DONE": 1
               },
               "INDEX": {
                  "DONE": 30
               }
            },
            "estimated": false,
            "percent_complete": 1
         }
      }
   }
]

A status of done, together with an empty errors list and every LIST and INDEX task reported as DONE, confirms the job completed successfully.
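
Because the endpoint is updated while the job runs, a script can poll it until the status reads done. The following is a minimal sketch, not an official tool: the function names, the `HDX_HOST`, `HDX_ORG`, and `HDX_TOKEN` variables, the retry limits, and the use of a plain GET request are all assumptions. It only recognises the done status shown on this page.

```shell
#!/usr/bin/env bash
# Sketch: poll the job status endpoint until the job reports "done".
# HDX_HOST, HDX_ORG and HDX_TOKEN are hypothetical variable names.

# Read a job response on stdin and print the top-level "status" value.
# A text match keeps the sketch dependency-free; a JSON parser is more robust.
# "status_detail" is not matched because the pattern requires "status" + colon.
job_status() {
  grep -o '"status": *"[^"]*"' | head -n 1 | cut -d '"' -f 4
}

# Arguments: job uuid, maximum attempts (default 60), delay in seconds (default 10).
# Returns 0 once the status is "done", 1 if the attempts run out first.
wait_for_job() {
  local job_uuid="$1" max_tries="${2:-60}" delay="${3:-10}"
  local attempt=1 status=""
  while [ "$attempt" -le "$max_tries" ]; do
    # Assumption: the status endpoint answers a plain GET request.
    status=$(curl -sS \
      "https://${HDX_HOST}/config/v1/orgs/${HDX_ORG}/jobs/batch/${job_uuid}/status" \
      -H "Authorization: Bearer ${HDX_TOKEN}" | job_status)
    echo "attempt ${attempt}: status=${status:-unknown}"
    if [ "$status" = "done" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, `wait_for_job 888ba890-3ece-403d-9753-4edd754bef61 30 20` checks up to 30 times, 20 seconds apart. The loop does not inspect details.errors, so check the final response as well before relying on the ingested data.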

Cancel a batch job

To cancel a batch job, call the cancel job endpoint.
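
This page does not show the cancel request itself. The sketch below assumes the endpoint follows the same pattern as the status endpoint, with a `/cancel` suffix and a POST request. Both are assumptions to verify against the API reference before use, and the function name and the `HDX_HOST`, `HDX_ORG`, and `HDX_TOKEN` variables are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: cancel a batch job by uuid.
# HDX_HOST, HDX_ORG and HDX_TOKEN are hypothetical variable names.

cancel_batch_job() {
  local job_uuid="${1:-}"
  if [ -z "$job_uuid" ]; then
    echo "usage: cancel_batch_job <job_uuid>" >&2
    return 2
  fi
  # Assumption: the cancel endpoint is .../jobs/batch/<job_uuid>/cancel
  # and expects POST. Confirm both in the API reference.
  curl -sS -X POST \
    "https://${HDX_HOST}/config/v1/orgs/${HDX_ORG}/jobs/batch/${job_uuid}/cancel" \
    -H "Authorization: Bearer ${HDX_TOKEN}"
}
```

Call it with the uuid returned when the job was submitted, for example `cancel_batch_job 888ba890-3ece-403d-9753-4edd754bef61`.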