Pagination Change in v5.4.0
Description of pagination, a breaking change to the Config API
Overview
The Config API supports fully-featured, consistent pagination across endpoints starting in Hydrolix v5.4.
All services inside a Hydrolix cluster are already migrated to the new pagination styles described below.
Two patterns of pagination are now in use in the Config API:
- Page number: most endpoints
- Cursor: catalog and data management endpoints that can return millions of objects
For catalog and partition management endpoints, which can track millions of objects, cursor-style pagination is necessary. These endpoints are primarily used inside the cluster for data management.
This page describes the API behavior change for all changed endpoints. Unaffected endpoints are listed, too. The Hydrolix Config API OpenAPI specification includes detail on all pagination parameters. Migration guidance is available below.
Page number pagination
Starting in Hydrolix v5.4.0, endpoints with variable result set sizes return a paginated response.
Most endpoints in the Config API for managing resources use the common industry pattern of page number pagination. In each request, API clients request one page of results and communicate both the page size and page number to the server.
Page number pagination is used for all objects with unbounded size. These resources are always returned in paginated responses.
Prior to this change, not all endpoints returned paginated results.
Pagination improves product scalability, preventing large result sets from introducing CPU, memory, and network bottlenecks for all applications handling the results.
Using page number pagination
Requests can include the HTTP query parameters page (default 1) and page_size (default 100, maximum 1000).
Page numbers begin at 1, which is why the default is 1.
An example request: /config/v1/users/?page=2&page_size=300
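As a rough illustration of these parameters, the following Python sketch builds a request path with page and page_size. The helper name page_url and the client-side range checks are assumptions made for this example; they are not part of the Config API.

```python
from urllib.parse import urlencode

DEFAULT_PAGE_SIZE = 100  # documented default for page number pagination
MAX_PAGE_SIZE = 1000     # documented maximum for page number pagination


def page_url(path: str, page: int = 1, page_size: int = DEFAULT_PAGE_SIZE) -> str:
    """Append page and page_size query parameters to an endpoint path.

    Hypothetical helper: the range checks are a client-side convenience,
    not documented server behavior.
    """
    if page < 1:
        raise ValueError("page numbers begin at 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    return f"{path}?{urlencode({'page': page, 'page_size': page_size})}"


print(page_url("/config/v1/users/", page=2, page_size=300))
# /config/v1/users/?page=2&page_size=300
```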
This example annotated response shows the results returned by Hydrolix, now including important pagination context:
{
"results": [ ... ], # Array of items
"count": int, # total records
"current": int, # this page num
"num_pages": int, # total pages
"next": int, # next page number or 0 if no next
"previous": int, # prev page number or 0 if no prev
}
Prior responses contained only the results:
[ ... ], # Array of items
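A client that talks to clusters on both sides of the upgrade may need to accept either shape. The sketch below shows one way to do that in Python; the function name extract_items is hypothetical, and the code assumes the response body has already been decoded from JSON.

```python
from typing import Any, List


def extract_items(body: Any) -> List[Any]:
    """Return the array of items from either response shape.

    Assumption: `body` is the JSON-decoded response from one of the
    changed endpoints.
    """
    if isinstance(body, list):
        # Prior shape: the response is the bare array of items.
        return body
    if isinstance(body, dict) and "results" in body:
        # Paginated shape: the items are under "results".
        return body["results"]
    raise ValueError("unrecognized response shape")


legacy = [{"name": "a"}, {"name": "b"}]
paginated = {"results": legacy, "count": 2, "current": 1,
             "num_pages": 1, "next": 0, "previous": 0}
print(extract_items(legacy) == extract_items(paginated))
# True
```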
Changed endpoints
/config/v1/auth_logs/
/config/v1/activity/
/config/v1/invites/
/config/v1/orgs/
/config/v1/orgs/{org_id}/credentials/
/config/v1/orgs/{org_id}/jobs/alter/
/config/v1/orgs/{org_id}/jobs/batch/
/config/v1/orgs/{org_id}/projects/
/config/v1/orgs/{org_id}/projects/{project_id}/activity
/config/v1/orgs/{org_id}/projects/{project_id}/dictionaries/
/config/v1/orgs/{org_id}/projects/{project_id}/dictionaries/{dictionary_id}/activity/
/config/v1/orgs/{org_id}/projects/{project_id}/functions/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/activity/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/columns/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/sources/kafka/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/sources/kinesis/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/sources/siem/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/transforms/
/config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/views/
/config/v1/orgs/{org_id}/storages/
/config/v1/orgs/{org_id}/storages/{id}/tables/
/config/v1/roles/
/config/v1/service_accounts/
/config/v1/tasks/
/config/v1/tasks/events/
/config/v1/transform_templates/
/config/v1/users/
The endpoint /config/v1/auth_logs/ implements pseudo-pagination, as no proper count is available from the Keycloak system.
Migration guidance
Option A
Not all clusters contain enough objects to exceed the maximum page size. Not all clients require all objects or resources.
In these cases, clients can use the results field, which holds the same array of objects that the endpoint previously returned.
Change: Adapt the client to extract the results array alone. Optionally set page_size to a manageable number.
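One risk with Option A is silently missing records when a result set grows past one page. The following Python sketch extracts results and warns when num_pages reports more pages. The function name single_page_items and the choice to warn rather than fail are assumptions for illustration.

```python
import warnings
from typing import Any, Dict, List


def single_page_items(body: Dict[str, Any]) -> List[Any]:
    """Return body["results"], warning if further pages were not fetched.

    Assumption: `body` is the JSON-decoded paginated response.
    """
    if body.get("num_pages", 1) > 1:
        warnings.warn(
            f"got {len(body['results'])} of {body.get('count')} records; "
            "remaining pages were not fetched"
        )
    return body["results"]


body = {"results": ["u1", "u2"], "count": 2, "current": 1,
        "num_pages": 1, "next": 0, "previous": 0}
print(single_page_items(body))
# ['u1', 'u2']
```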
Option B
Introduce pagination to the affected clients. Page numbers begin at 1.
Change: Introduce pagination to the client. Add a query parameter page initialized to 1. Increment page for each request to the endpoint. Accumulate results until the next field is 0.
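The loop described in Option B can be sketched in Python as follows. The HTTP call is abstracted behind fetch_page, a hypothetical callable that takes a page number and returns the decoded response; fake_fetch is a stand-in so the sketch runs without a cluster.

```python
from typing import Any, Callable, Dict, List

# Hypothetical: fetch_page(page) performs the authenticated GET with
# ?page=<page> and returns the JSON-decoded response body.
FetchPage = Callable[[int], Dict[str, Any]]


def fetch_all(fetch_page: FetchPage) -> List[Any]:
    """Accumulate results from page 1 until the next field is 0."""
    items: List[Any] = []
    page = 1
    while True:
        body = fetch_page(page)
        items.extend(body["results"])
        if not body["next"]:  # 0 means there is no next page
            return items
        page += 1


def fake_fetch(page: int) -> Dict[str, Any]:
    """Stand-in for the HTTP call: serves five records, two per page."""
    data = ["a", "b", "c", "d", "e"]
    size, num_pages = 2, 3
    start = (page - 1) * size
    return {
        "results": data[start:start + size],
        "count": len(data),
        "current": page,
        "num_pages": num_pages,
        "next": page + 1 if page < num_pages else 0,
        "previous": page - 1,  # evaluates to 0 on the first page
    }


print(fetch_all(fake_fetch))
# ['a', 'b', 'c', 'd', 'e']
```

Accumulating every page in memory suits small and medium result sets; for very large ones, processing each page as it arrives may be preferable.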
Option C
Suppress pagination.
Change: Add the query parameter page_size=0.
Will be removed in a future release
Pagination on these endpoints is important for availability and scalability for the Hydrolix cluster and applications. While this query parameter retains the original behavior of the endpoints, it will be removed.
Cursor pagination
All catalog and partition management endpoints now use cursor pagination. These endpoints are crucial for managing the catalog and interacting with the underlying Hydrolix data storage.
The first request to any endpoint supporting cursor pagination should omit the cursor query parameter. Subsequent requests can set cursor to the value of the next or previous field to move through the result set. The page_size parameter controls the maximum number of results in the response.
Previously, catalog endpoints were inconsistently paginated: they relied on a combination of page number pagination and cursor pagination. Cursor pagination is faster, well suited to volatile resources, and retains ordering and position.
Most Hydrolix users never use these endpoints and do not need to take any action.
Note: Customers using the Hydrolix Connector for Apache Spark should upgrade to v2.0.0-5.1.1 or a later release, which is pagination-aware.
Using cursor pagination
All requests can include the page_size HTTP query parameter (default 1000, maximum 10000). After the first request, subsequent requests must include the HTTP query parameter cursor. The cursor value indicates position and direction in the result set. Example:
/config/v1/orgs/{org_id}/catalog/?page_size=4000&cursor=XYZ
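A minimal Python sketch of building such a request path, omitting cursor on the first request, might look like this. The helper name cursor_url, the placeholder org ID, and the client-side size check are assumptions for illustration.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlencode

MAX_CURSOR_PAGE_SIZE = 10000  # documented maximum for cursor pagination


def cursor_url(path: str, page_size: int = 1000,
               cursor: Optional[str] = None) -> str:
    """Build a cursor-paginated request path (hypothetical helper)."""
    if not 1 <= page_size <= MAX_CURSOR_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_CURSOR_PAGE_SIZE}")
    params: Dict[str, Union[int, str]] = {"page_size": page_size}
    if cursor is not None:
        # Only requests after the first one carry a cursor.
        params["cursor"] = cursor
    return f"{path}?{urlencode(params)}"


path = "/config/v1/orgs/ORG_ID/catalog/"  # ORG_ID is a placeholder
print(cursor_url(path))
# /config/v1/orgs/ORG_ID/catalog/?page_size=1000
print(cursor_url(path, page_size=4000, cursor="XYZ"))
# /config/v1/orgs/ORG_ID/catalog/?page_size=4000&cursor=XYZ
```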
Responses look like this:
{
"results": [ ... ], # Array of items
"next": string, # next page link or None if no next page
"previous": string, # prev page link or None if no previous page
}
Cursors are opaque Base64-encoded strings. If the value of previous is cj0xJnA9NDg3, then supplying the query parameter cursor=cj0xJnA9NDg3 returns the previous set of results.
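The annotated response describes next and previous as page links, while the example treats the value as a bare cursor. A defensive client could accept both forms, as in this Python sketch. The function name cursor_from and the example host are hypothetical, and the link form is an assumption rather than confirmed API behavior.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def cursor_from(value: Optional[str]) -> Optional[str]:
    """Return the cursor carried by a next/previous value.

    Accepts a bare cursor string or a link with a cursor query
    parameter (assumed form). The cursor itself is never decoded,
    because it is opaque to clients.
    """
    if not value:
        return None  # no further page in that direction
    query = parse_qs(urlparse(value).query)
    if "cursor" in query:
        return query["cursor"][0]
    return value  # already a bare cursor


print(cursor_from("cj0xJnA9NDg3"))
# cj0xJnA9NDg3
print(cursor_from("https://host.example/config/v1/orgs/ORG_ID/catalog/?cursor=cj0xJnA9NDg3"))
# cj0xJnA9NDg3
```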
Changed endpoints
These endpoints now use cursor instead of page number pagination.
- /config/v1/orgs/{org_id}/catalog/
- /config/v1/orgs/{org_id}/projects/{project_id}/tables/{table_id}/partitions/
- /config/v1/orgs/{org_id}/storages/{storage_id}/catalog/
One endpoint supported cursor pagination prior to v5.4.
Migration guidance
Instead of building the next or previous endpoint from the page number, use the value of next or previous to get the next set of values. For the cursor-paginated endpoints, page_size has a maximum of 10000.
There is no ability to suppress cursor pagination.
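The guidance above can be sketched as a Python generator that follows next until it is empty. Here fetch_page is a hypothetical callable that takes a cursor (None on the first request) and returns the decoded response. The sketch assumes next holds a value that can be passed back as cursor, and the PAGES table is fabricated sample data so the code runs without a cluster.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical: fetch_page(cursor) performs the GET, omitting the cursor
# query parameter when cursor is None, and returns the decoded body.
FetchCursorPage = Callable[[Optional[str]], Dict[str, Any]]


def iter_items(fetch_page: FetchCursorPage) -> Iterator[Any]:
    """Yield every item, following the next field until it is empty."""
    cursor: Optional[str] = None  # the first request omits the cursor
    while True:
        body = fetch_page(cursor)
        yield from body["results"]
        cursor = body.get("next")
        if not cursor:
            return


# Sample pages keyed by the cursor that retrieves them (made-up values).
PAGES: Dict[Optional[str], Dict[str, Any]] = {
    None: {"results": [1, 2], "next": "c1", "previous": None},
    "c1": {"results": [3, 4], "next": "c2", "previous": "c0"},
    "c2": {"results": [5], "next": None, "previous": "c1"},
}


def fake_fetch(cursor: Optional[str]) -> Dict[str, Any]:
    """Stand-in for the HTTP call."""
    return PAGES[cursor]


print(list(iter_items(fake_fetch)))
# [1, 2, 3, 4, 5]
```

A generator is used here because these endpoints can return millions of objects, so holding the full result set in memory may not be practical.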
Unaffected endpoints
Endpoints that return result sets of reasonably bounded or constant size do not support pagination.
Unchanged endpoints
- /config/v1/orgs/{org_id}/projects/{project_id}/dictionaries/files/
- /config/v1/invites/bulk_invites
- /config/v1/auth/identity_providers/
- /config/v1/pools
- /config/v1/pools/ingest_endpoints/
- /config/v1/job_statuses/
- /config/v1/dictionary_layouts/
- /config/v1/dictionary_input_formats/
- /config/v1/roles/permissions