Rate Limits

Hydrolix clusters can rate limit ingested data at the project, table, or transform level. Rate limiting prevents any single ingest stream from overwhelming a cluster and disrupting other streams of data to the same cluster, which avoids noisy neighbor problems. Rate limits are configured when creating or updating these entities. By default, no rate limit is configured for a project, table, or transform. A rate limit configuration looks like the following:

"rate_limit": {
  "limit": "10_000_000",
  "burst": "5_000_000"
}

📘
Numeric formatting: 10_000_000 or 10000000?
Both 10_000_000 and 10000000 are valid JSON when quoted as strings, and both can be supplied to the Hydrolix API directly using cURL. However, some client applications, including the Hydrolix API Reference, do not accept underscores (10_000_000). With those clients, use the format 10000000 instead.
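For clients that reject underscores, the value can be normalized before it is sent. The following is a minimal Python sketch. The helper name strip_underscores is hypothetical and is not part of any Hydrolix tooling.

```python
def strip_underscores(value: str) -> str:
    """Return the plain-digit form of a value such as "10_000_000"."""
    # int() accepts single underscores between digits (Python 3.6+)
    # and raises ValueError for anything that is not an integer.
    return str(int(value))


print(strip_underscores("10_000_000"))  # 10000000
```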

You can specify both a limit in bytes per second (limit) and a maximum payload size in bytes (burst). Request handlers inspect the Content-Length header of each request to apply rate-limiting constraints.

The burst field may be omitted, in which case it is set equal to limit. If burst is explicitly set to 0, no traffic is allowed through. If burst is set lower than limit, it governs the maximum allowed size of a single request. Because rate limits can be set at the project, table, and transform levels, a request is rejected if it exceeds the limit at any one of these levels.
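One way to picture how limit and burst interact is a token bucket that refills at limit bytes per second and holds at most burst bytes. This page does not state which algorithm Hydrolix uses, so treat the Python sketch below as an illustration of the described semantics under that assumption, not as the actual implementation. The class and method names are hypothetical.

```python
class TokenBucket:
    """Illustrative bucket: refills at `limit` bytes/s, holds at most `burst` bytes."""

    def __init__(self, limit: int, burst: int, now: float = 0.0):
        self.limit = limit
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, content_length: int, now: float) -> bool:
        # Refill for the elapsed time, never above the burst capacity.
        elapsed = now - self.last
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.limit)
        self.last = now
        # burst == 0 blocks everything; a request larger than the
        # available tokens (and so any request larger than burst) is rejected.
        if self.burst <= 0 or content_length > self.tokens:
            return False
        self.tokens -= content_length
        return True


bucket = TokenBucket(limit=10_000_000, burst=5_000_000)
print(bucket.allow(4_000_000, now=0.0))   # True: the bucket starts full
print(bucket.allow(4_000_000, now=0.1))   # False: only about 2 MB available
print(bucket.allow(6_000_000, now=10.0))  # False: larger than burst
```

In this model, a request larger than burst can never succeed, no matter how long the client waits, which matches the description of burst as a maximum request size.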

Example Configuration

The following is an example of configuration for a project, table, and two transforms with rate limits.

{
  "name": "example_project",
  "description": "A project with a rate limit of 10 MB per second and a maximum request size of 5 MB",
  "settings": {
    "rate_limit": {
      "limit": "10_000_000",
      "burst": "5_000_000"
    }
  }
}

{
  "name": "example_table",
  "description": "A table with a rate limit of 5 MB per second and a maximum request size of 3 MB",
  "settings": {
    "rate_limit": {
      "limit": "5_000_000",
      "burst": "3_000_000"
    }
  }
}

{
  "name": "transform_low_limit",
  "description": "A transform with a rate limit of 2 MB per second and a maximum request size of 1 MB",
  "settings": {
    "rate_limit": {
      "limit": "2_000_000",
      "burst": "1_000_000"
    }
  }
}

{
  "name": "transform_high_limit",
  "description": "A transform with a rate limit of 4 MB per second and a maximum request size of 4 MB",
  "settings": {
    "rate_limit": {
      "limit": "4_000_000"
    }
  }
}

If a request arrives with a Content-Length of 3.5 MB and the following headers:

x-hdx-table example_project.example_table
x-hdx-transform transform_high_limit

the request would be accepted by the transform transform_high_limit, whose burst defaults to its limit of 4 MB, but would be rejected by the 3 MB burst setting for example_table.
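The layered check in this example can be sketched in Python as follows. The burst values come from the example configuration above. The function name, and the order in which layers are evaluated, are assumptions made for illustration. The sketch only checks request size against burst and does not model the per-second rate.

```python
from typing import List, Optional, Tuple

# Burst values in bytes from the example configuration.
# transform_high_limit omits burst, so it falls back to its limit.
LAYERS: List[Tuple[str, int]] = [
    ("example_project", 5_000_000),
    ("example_table", 3_000_000),
    ("transform_high_limit", 4_000_000),
]


def first_rejecting_layer(content_length: int,
                          layers: List[Tuple[str, int]] = LAYERS) -> Optional[str]:
    """Return the first layer whose burst is smaller than the request, else None."""
    for name, burst in layers:
        if content_length > burst:
            return name
    return None


print(first_rejecting_layer(3_500_000))  # example_table
```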

Edge Cases

The following examples show edge cases in rate_limit configuration and the resulting behavior.

"rate_limit": {
  "burst": "1_000_000"
}

With the above configuration, there would be no limit and no maximum burst size, because limit is unset. This is functionally equivalent to supplying no rate_limit setting.

"rate_limit": {
  "limit": "1_000_000",
  "burst": 0
}

With the above configuration, all traffic would be blocked due to a burst size of 0.

"rate_limit": {
  "limit": 0,
  "burst": "1_000_000"
}

With the above configuration, there would be no limit and no maximum burst size. This is functionally equivalent to supplying no rate_limit setting. When limit is unset or zero, the burst setting is ignored, whether burst is unset, zero, or non-zero.
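The rules behind these edge cases can be condensed into a small Python sketch. The function name effective_rate_limit is hypothetical. The logic only restates the behavior described on this page and is not taken from the Hydrolix implementation.

```python
from typing import Optional, Tuple


def effective_rate_limit(rate_limit: Optional[dict]) -> Optional[Tuple[int, int]]:
    """Return (limit, burst) in bytes, or None when no rate limiting applies."""
    if not rate_limit:
        return None
    # int() accepts both "1_000_000" and 1000000.
    limit = int(rate_limit.get("limit", 0))
    if limit == 0:
        return None  # limit unset or zero: burst is ignored
    if "burst" not in rate_limit:
        return (limit, limit)  # burst omitted: it defaults to limit
    return (limit, int(rate_limit["burst"]))  # a burst of 0 blocks all traffic


print(effective_rate_limit({"burst": "1_000_000"}))                # None
print(effective_rate_limit({"limit": "1_000_000", "burst": 0}))    # (1000000, 0)
print(effective_rate_limit({"limit": 0, "burst": "1_000_000"}))    # None
```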

Troubleshooting Rate Limiting

If your requests are being rejected due to a rate limit, you should see the following in the response to the client:

{
  "code": 429,
  "message": {
    "error": "rate exceeded",
    "limiter": "blocked-transform-uuid"
  }
}
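A client can detect this response and find out which limiter rejected the request. The Python sketch below parses a body shaped like the one above. The function name is hypothetical, and the sketch assumes the HTTP status code matches the code field in the body.

```python
import json
from typing import Optional


def rate_limited_by(status: int, body: str) -> Optional[str]:
    """Return the limiter identifier from a rate-limit rejection, else None."""
    if status != 429:
        return None
    try:
        message = json.loads(body).get("message", {})
    except (ValueError, AttributeError):
        return None  # body is not JSON, or not a JSON object
    if isinstance(message, dict) and message.get("error") == "rate exceeded":
        return message.get("limiter")
    return None


body = '{"code": 429, "message": {"error": "rate exceeded", "limiter": "blocked-transform-uuid"}}'
print(rate_limited_by(429, body))  # blocked-transform-uuid
```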

Limitations

Limits are applied per request. They can't be applied per row or per event.
The specified rate limits are an upper bound: Hydrolix guarantees an ingest rate of at most the configured limit.
Requests are rejected before they reach the Intake Spill or Data Splitter features. As a result, rate-limited data isn't queued, or split and retried. The client must retry any request that was rejected due to a rate limit.
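Because retries are left to the client, a simple retry loop with exponential backoff is one option. The Python sketch below is a generic pattern, not Hydrolix-provided code. The send callable, the attempt count, and the delays are all assumptions to adapt to your client.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[], int],
                    max_attempts: int = 5,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Wait 0.5 s, 1 s, 2 s, ... before each retry.
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

Here send stands for whatever function posts one ingest request and returns the HTTP status code. A request larger than the configured burst will never succeed on retry, so split such payloads into smaller requests instead.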
