Pagination Change in v5.4.0

Description of pagination, a breaking change to the Config API

Overview

The Config API supports fully-featured, consistent pagination across endpoints starting in Hydrolix v5.4.

All services inside a Hydrolix cluster are already migrated to the new pagination styles described below.

Two patterns of pagination are now in use in the Config API:

  • Page number: most endpoints
  • Cursor: catalog and data management endpoints, which can return millions of objects

For catalog and partition management endpoints, which can track millions of objects, cursor-style pagination is necessary. These endpoints are primarily used inside the cluster for data management.

This page describes the API behavior change for every changed endpoint and also lists the unaffected endpoints. The Hydrolix Config API OpenAPI specification includes details on all pagination parameters. Migration guidance is available below.

Page number pagination

Starting in Hydrolix v5.4.0, endpoints with variable result set sizes return a paginated response.

Most endpoints in the Config API for managing resources use the common industry pattern of page number pagination. In each request, API clients request one page of results and communicate both the page size and page number to the server.

Page number pagination is used for all resource collections of unbounded size. These collections are always returned in paginated responses.

Prior to this change, not all endpoints returned paginated results.

Pagination improves product scalability, preventing large result sets from introducing CPU, memory, and network bottlenecks for all applications handling the results.

Using page number pagination

Requests can include HTTP query parameters page (default 1) and page_size (default 100, maximum 1000).

Page numbers begin at page 1, which is why the default is 1.

An example request: /config/v1/users/?page=2&page_size=300
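As a minimal sketch of how a client might build such a request path, the helper below appends page and page_size to an endpoint path. The function name page_url and the decision to cap page_size on the client side are assumptions made for illustration; only the parameter names, defaults, and the maximum of 1000 come from this page.

```python
from urllib.parse import urlencode

MAX_PAGE_SIZE = 1000  # documented maximum for page number pagination


def page_url(path: str, page: int = 1, page_size: int = 100) -> str:
    """Return `path` with page and page_size query parameters appended.

    Hypothetical helper: the defaults mirror the documented server defaults.
    """
    if page < 1:
        raise ValueError("page numbers begin at 1")
    # Cap the requested size at the documented maximum (client-side choice).
    page_size = min(page_size, MAX_PAGE_SIZE)
    return f"{path}?{urlencode({'page': page, 'page_size': page_size})}"
```

Calling page_url("/config/v1/users/", page=2, page_size=300) produces the example request shown above.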

This example annotated response shows the results returned by Hydrolix, now including important pagination context:

{
  "results": [ ... ],    # Array of items
  "count": int,          # total records
  "current": int,        # this page num
  "num_pages": int,      # total pages
  "next": int,           # next page number or 0 if no next
  "previous": int,       # prev page number or 0 if no prev
}

Prior responses contained only the results:

[ ... ],                 # Array of items

Changed endpoints

The endpoint /config/v1/auth_logs/ implements pseudo-pagination, as no proper count is available from the Keycloak system.

Migration guidance

Option A

Not all clusters contain enough objects to exceed the maximum page size. Not all clients require all objects or resources.

In these cases, clients can use the results field of the response as is. It holds the same array of objects that was previously returned.

Change: Adapt the client to extract the results array alone. Optionally set page_size to a manageable number.
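One way to make this change tolerant of both old and new clusters is to accept either response shape. The sketch below assumes the response body has already been decoded from JSON; the function name extract_results is hypothetical.

```python
from typing import Any


def extract_results(payload: Any) -> list:
    """Return the list of items from either response shape.

    Hypothetical helper: `payload` is the already-decoded JSON body.
    """
    if isinstance(payload, list):
        # Pre-v5.4.0 shape: the body is the array itself.
        return payload
    # v5.4.0 shape: the array lives under the "results" key.
    return payload["results"]
```

With this approach the rest of the client keeps working on a plain list, and the pagination fields (count, num_pages, next, previous) are simply ignored.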

Option B

Introduce pagination to the affected clients. Page numbers begin at 1.

Change: Introduce pagination to the client. Add a query parameter page initialized at 1. Increment page for each request to the endpoint. Accumulate results until the next variable is 0.
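The loop described in Option B can be sketched as follows. To keep the example independent of any HTTP library or authentication scheme, it takes a caller-supplied function, here called fetch_page, that requests one page by number and returns the decoded response body. Both fetch_all and fetch_page are hypothetical names, not part of the Config API.

```python
from typing import Callable


def fetch_all(fetch_page: Callable[[int], dict]) -> list:
    """Accumulate results from every page of a page-number-paginated endpoint.

    `fetch_page(page)` is an assumed callable that performs the request for
    one page and returns the decoded JSON body.
    """
    items: list = []
    page = 1  # page numbers begin at 1
    while True:
        body = fetch_page(page)
        items.extend(body["results"])
        if body["next"] == 0:  # 0 signals that there is no next page
            break
        page += 1
    return items
```

Accumulating everything in memory reintroduces the scalability concern that pagination addresses, so a client handling large result sets may prefer to process each page as it arrives.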

Option C

Suppress pagination.

Change: Add query parameter page_size=0.

Warning: page_size=0 will be removed in a future release.

Pagination on these endpoints is important for the availability and scalability of the Hydrolix cluster and its client applications. Setting page_size=0 retains the original behavior of the endpoints for now, but support for it will be removed.

Cursor pagination

All catalog and partition management endpoints now use cursor pagination. These endpoints are crucial for managing the catalog and interacting with the underlying Hydrolix data storage.

The first request to any endpoint supporting cursor pagination should omit the cursor query parameter. Subsequent requests can set cursor to the value of next or previous from the latest response to move through the result set. The page_size parameter controls the maximum number of results in the response.

Previously, catalog endpoints were inconsistently paginated: they relied on a combination of page number pagination and cursor pagination. Cursor pagination is faster, is well suited to volatile resources, and retains ordering and position.

Most Hydrolix users never use these endpoints and do not need to take any action.

Note: Customers using the Hydrolix Connector for Apache Spark should upgrade to v2.0.0-5.1.1 or a later release, which is pagination-aware.

Using cursor pagination

All requests can include the page_size HTTP query parameter (default 1000, maximum 10000). After the first request, subsequent requests must include the cursor HTTP query parameter. The cursor value indicates position and direction in the result set. Example:

/config/v1/orgs/:org_id/catalog/?page_size=4000&cursor=XYZ

Responses look like this:

{
  "results": [ ... ],    # Array of items
  "next":     string,    # next page link or None if no next page
  "previous": string,    # prev page link or None if no previous page
}

Cursors are opaque Base64-encoded strings. If the value of previous is cj0xJnA9NDg3, then supplying the query parameter cursor=cj0xJnA9NDg3 returns the previous set of results.
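The annotated response describes next and previous as links, while the example treats the value as the cursor token itself. A client that wants to cope with either form could normalize the value before reuse. The helper below is a sketch under that assumption; the name cursor_from is hypothetical, and the exact format the server returns should be confirmed against the OpenAPI specification.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def cursor_from(value: Optional[str]) -> Optional[str]:
    """Return the cursor token carried by a next/previous value, or None.

    Assumption: the value is either a bare token or a link whose query
    string contains a `cursor` parameter.
    """
    if not value:
        return None  # no next/previous page
    query = urlsplit(value).query
    if not query:
        return value  # bare token: pass it through untouched
    tokens = parse_qs(query).get("cursor")
    return tokens[0] if tokens else None
```

The token is never decoded or inspected, which keeps the cursor opaque to the client.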

Changed endpoints

These endpoints now use cursor instead of page number pagination.

One endpoint supported cursor pagination prior to v5.4.

Migration guidance

Instead of building the next or previous endpoint from the page number, use the value of next or previous to get the next set of values. For the cursor-paginated endpoints, the page_size has a maximum of 10000.
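A forward walk through a cursor-paginated endpoint might look like the sketch below. As an illustration only, it delegates the HTTP request to a caller-supplied function, here called fetch, which receives None for the first request and the previous response's next value afterwards. The names iter_cursor_results and fetch are hypothetical, and the sketch assumes the next value can be handed back as is for the following request.

```python
from typing import Any, Callable, Iterator, Optional


def iter_cursor_results(fetch: Callable[[Optional[str]], dict]) -> Iterator[Any]:
    """Yield every item from a cursor-paginated endpoint, page by page.

    `fetch(cursor)` is an assumed callable that performs one request and
    returns the decoded JSON body; it must omit the cursor when given None.
    """
    cursor: Optional[str] = None  # the first request omits the cursor
    while True:
        body = fetch(cursor)
        yield from body["results"]
        cursor = body.get("next")
        if not cursor:  # no next value means the last page was reached
            break
```

Yielding items lazily avoids holding millions of catalog entries in memory at once, which matters for endpoints of this size.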

There is no ability to suppress cursor pagination.

Unaffected endpoints

Endpoints that return results of constant or reasonably bounded size do not support pagination.

Unchanged endpoints