Utilities & High-Level Operations

Commands for advanced and utility operations: cross-cluster migrations, integrity checks, shadow tables, resource pool management, public integration transforms, default query options, resource defaults, and resource summaries.

hdxcli v1.0.83

Migrate

Migrate a table and its dependencies to a target cluster.

This command orchestrates a table migration, which can involve two main stages:

  1. Resource Creation: Replicates the source project, table, and transforms
    on the target cluster. Optionally, it can also migrate associated
    functions and dictionaries.
  2. Data Migration: Copies the table's data from the source storage
    to the target and updates the catalog to make the data queryable.

Arguments:

  • SOURCE_TABLE: The source table to migrate, in 'project.table' format.
  • TARGET_TABLE: The destination for the migration, in 'project.table' format.

Key Options:

  • Target Cluster: Specify the destination with --target-profile or with individual
    connection details (--target-hostname, --target-username, etc.).
  • Migration Scope (--only):
    • resources: Migrates only the project, table, and other definitions.
    • data: Migrates only the data, assuming resources already exist.
    • If omitted, a full migration (resources and data) is performed.
  • Data Handling:
    • --reuse-partitions: For clusters sharing storage. Migrates the table
      definition but reuses the existing data, avoiding a data copy.
    • --from-date/--to-date: Filter the data to be migrated by a date range.
  • Rclone Remote:
    • --rc-host, --rc-user, --rc-pass: Connection details for the Rclone
      server that will perform the data transfer. Required for any migration
      that copies data.

Usage

hdxcli migrate [OPTIONS] SOURCE_TABLE TARGET_TABLE

Options

Option                                Description
-tp, --target-profile TEXT            Name of the pre-configured profile for the target cluster.
-h, --target-hostname TEXT            Hostname of the target cluster.
-u, --target-username TEXT            Username for the target cluster.
-p, --target-password TEXT            Password for the target cluster.
-s, --target-uri-scheme [http|https]  URI scheme for the target cluster (http or https).
--allow-merge                         Allow migration even if the source table has the merge process enabled.
--only [resources|data]               Limit the migration to 'resources' (project, table, etc.) or 'data' (partitions).
--with-functions                      Include functions in the resource migration.
--with-dictionaries                   Include dictionaries in the resource migration.
--from-date                           Minimum timestamp for filtering partitions in YYYY-MM-DD HH:MM:SS format.
--to-date                             Maximum timestamp for filtering partitions in YYYY-MM-DD HH:MM:SS format.
--reuse-partitions                    Reuse existing data partitions instead of copying them. Requires shared storage.
--rc-host TEXT                        The hostname or IP address of the Rclone remote server.
--rc-user TEXT                        The username for authenticating with the Rclone remote server.
--rc-pass TEXT                        The password for authenticating with the Rclone remote server.

Examples

# Perform a full migration from a staging to a production project, including functions and dictionaries
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --with-functions --with-dictionaries --rc-host rclone.host --rc-user rclone.user --rc-pass rclone.pass

# Migrate only the resources (project, table, etc.), without copying data
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --only resources

# Migrate only data for a specific date range, assuming resources already exist
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --only data --from-date "2025-01-01 00:00:00" --rc-host rclone.host --rc-user rclone.user --rc-pass rclone.pass

# Perform a migration between clusters that share the same storage backend, avoiding data copy
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --reuse-partitions
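
The two stages described for this command can also be run one after the other with --only, which leaves room to verify the resources on the target before any data is copied. The sketch below is one possible way to script that, not an official workflow. The helper names run and migrate_in_stages are hypothetical, the stage and prod profiles and the Rclone values are the placeholders used in the examples above, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against real clusters.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"   # display only; original quoting is not preserved
    else
        "$@"
    fi
}

# Hypothetical helper: migrate resources first, then data from a start date.
# Arguments: SOURCE_TABLE TARGET_TABLE FROM_DATE
migrate_in_stages() {
    src="$1"; dst="$2"; from_date="$3"

    # Stage 1: create the project, table, and transforms on the target.
    run hdxcli --profile stage migrate "$src" "$dst" \
        --target-profile prod --only resources || return 1

    # Stage 2: copy the data. This stage needs the Rclone connection details;
    # the defaults below are placeholders, not real endpoints.
    run hdxcli --profile stage migrate "$src" "$dst" \
        --target-profile prod --only data --from-date "$from_date" \
        --rc-host "${RC_HOST:-rclone.host}" \
        --rc-user "${RC_USER:-rclone.user}" \
        --rc-pass "${RC_PASS:-rclone.pass}"
}
```

Calling migrate_in_stages staging_proj.logs prod_proj.logs "2025-01-01 00:00:00" prints both commands in order. Stopping after a failed first stage avoids copying data into a target that has no matching table.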

Check Health

Check the integrity of transforms and autoviews. This command inspects transforms and autoviews for common integrity issues, such as datatype mismatches.

Usage Scenarios:

  • No arguments: Checks all transforms in all projects.
  • With PROJECT_NAME: Limits the check to a specific project.
  • With PROJECT_NAME and TABLE_NAME: Limits the check to a single table.

The --repair flag will attempt to automatically fix any detected issues that are safely repairable.

Usage

hdxcli check-health [OPTIONS] [PROJECT_NAME] [TABLE_NAME]

Options

Option    Description
--repair  Attempt to automatically repair detected issues.

Examples

# Check the entire organization
hdxcli check-health

# Check a specific project and attempt to repair issues
hdxcli check-health my_project --repair

# Check a single table within a project
hdxcli check-health my_project my_table
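
Because --repair changes resources, it can be useful to review a read-only check before allowing repairs. The sketch below shows one way to separate the two passes. The helper names run and check_project are hypothetical, and nothing here assumes a particular exit code or output format from check-health. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: inspect a project, then repair only when asked.
# Arguments: PROJECT_NAME [yes|no]   (second argument defaults to "no")
check_project() {
    project="$1"; repair="${2:-no}"

    # Pass 1: read-only check, so the findings can be reviewed first.
    run hdxcli check-health "$project" || return 1

    # Pass 2: repair pass, run only on explicit request.
    if [ "$repair" = "yes" ]; then
        run hdxcli check-health "$project" --repair
    fi
}
```

For example, check_project my_project runs only the read-only check, while check_project my_project yes follows it with the repair pass.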

Shadow

Shadow tables allow safe testing of transform changes by re-ingesting a small data sample from a source table.

Usage

hdxcli shadow [OPTIONS] COMMAND [ARGS]...

Options

Option                  Description
--project PROJECT_NAME  Use or override project set in the profile.

Create

Create a new shadow table. This command creates a new shadow table and a corresponding transform based on a source table and transform. It requires the source context to be specified via options.

Usage

hdxcli shadow create [OPTIONS] TRANSFORM_SETTINGS_PATH

Options

Option                       Description
--source-table TEXT          The source table to shadow. [required]
--source-transform TEXT      The source transform to shadow. [required]
--sample-rate INTEGER RANGE  Percentage of the original data to be ingested in the shadow table. [0<=x<=5; required]
--table-name TEXT            Name of the shadow table. Default: shadow_ + 'source-table-name'.
--table-settings PATH        Path to a file containing settings for the shadow table.
--transform-name TEXT        Name of the transform for the shadow table. Default: shadow_ + 'source-transform-name'.

Examples

# Create a shadow table with a 2% sample rate
hdxcli shadow --project my_project create ./new_transform_for_shadow.json --source-table my_table --source-transform my_transform --sample-rate 2

Delete

Delete a shadow table.

This command removes the shadow table settings from the source transform and then deletes the shadow table itself.

Usage

hdxcli shadow delete [OPTIONS] SHADOW_NAME

Options

Option                         Description
--disable-confirmation-prompt  Suppress confirmation to delete the shadow.

Examples

# Delete the shadow table named 'my_shadow_table'
hdxcli shadow --project my_project delete my_shadow_table

Start

Start or update sampling for a shadow table, setting the specified sampling rate on the source transform.

Usage

hdxcli shadow start [OPTIONS] SHADOW_NAME

Options

Option                       Description
--sample-rate INTEGER RANGE  Percentage of the original data to be ingested in the shadow table. [1<=x<=5; required]

Examples

# Start sampling 3% of data for 'my_shadow_table'
hdxcli shadow --project my_project start my_shadow_table --sample-rate 3

Stop

Stop sampling for a shadow table, setting the sampling rate on the source transform to 0.

Usage

hdxcli shadow stop [OPTIONS] SHADOW_NAME

Examples

# Stop all sampling for 'my_shadow_table'
hdxcli shadow --project my_project stop my_shadow_table
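
The four shadow subcommands form a lifecycle: create, adjust the sampling rate, stop, and delete. The sketch below strings them together under stated assumptions. The helper names (run, shadow_begin, shadow_rate, shadow_end) are hypothetical, and the shadow table name relies on the documented default of shadow_ plus the source table name. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create the shadow table from a candidate transform file.
# Arguments: PROJECT SOURCE_TABLE SOURCE_TRANSFORM TRANSFORM_SETTINGS_PATH RATE
shadow_begin() {
    run hdxcli shadow --project "$1" create "$4" \
        --source-table "$2" --source-transform "$3" --sample-rate "$5"
}

# Change the sampling rate while the test is running.
# Arguments: PROJECT SOURCE_TABLE RATE
shadow_rate() {
    # Default shadow name: shadow_ + source table name (no --table-name given).
    run hdxcli shadow --project "$1" start "shadow_$2" --sample-rate "$3"
}

# Stop sampling, then remove the shadow table.
# Arguments: PROJECT SOURCE_TABLE
shadow_end() {
    run hdxcli shadow --project "$1" stop "shadow_$2" || return 1
    # Outside dry-run mode, delete asks for confirmation.
    run hdxcli shadow --project "$1" delete "shadow_$2"
}
```

A typical session would call shadow_begin with a low rate, inspect the shadow table, optionally raise the rate with shadow_rate, and finish with shadow_end.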

Pool

Commands to create, list, and manage resource pools for cluster services.

Usage

hdxcli pool [OPTIONS] COMMAND [ARGS]...

Options

Option       Description
--pool TEXT  Use or override pool set in the profile.

Create

Create a new service pool and allocate its resources (CPU, memory, storage).

Usage

hdxcli pool create [OPTIONS] POOL_NAME POOL_SERVICE

Options

Option                  Description
-r, --replicas INTEGER  Number of replicas for the workload (default: 1).
-c, --cpu FLOAT         Dedicated CPU allocation for each replica (default: 0.5).
-m, --memory FLOAT      Dedicated memory allocation for each replica, in Gi (default: 0.5).
-s, --storage FLOAT     Storage capacity for each replica, in Gi (default: 0.5).

Examples

# Create a pool named 'my_pool' for the 'query-peer' service
hdxcli pool create my_pool query-peer --replicas 2 --cpu 1 --memory 2 --storage 10

Delete

Delete a specific pool.

This is a permanent action and cannot be undone. You will be prompted for confirmation unless --disable-confirmation-prompt is used.

Usage

hdxcli pool delete [OPTIONS] POOL_NAME

Options

Option                         Description
--disable-confirmation-prompt  Suppress confirmation to delete pool.

Examples

# Delete the pool named 'my_pool'
hdxcli pool delete my_pool

List

List all available pools.

Retrieves a list of all pools you have access to. Pagination options (--page, --page-size) are available if supported by the API.

Usage

hdxcli pool list [OPTIONS]

Options

Option                   Description
-p, --page INTEGER       Page number.
-s, --page-size INTEGER  Number of items per page.

Examples

# List the first page of pools
hdxcli pool list

Settings

List, get, or set key-value settings for a specific pool.

This command operates in three modes:

  • LIST: Invoked with no arguments, it lists all settings.
  • GET: Invoked with only a KEY, it retrieves the value of that setting.
  • SET: Invoked with a KEY and a VALUE, it sets the value for that setting.

The VALUE can be a string, a number, or a JSON-formatted string for lists/objects. When setting a value, the --force option may be required for certain resources.

Usage

hdxcli pool settings [OPTIONS] [KEY] [VALUE]

Options

Option       Description
-F, --force  This flag allows adding the force_operation parameter to the request.

Examples

# List all settings for the pool 'my_pool'
hdxcli pool --pool my_pool settings

# Get the 'name' setting for the pool 'my_pool'
hdxcli pool --pool my_pool settings name

# Set a new 'name' setting for the pool 'my_pool'
hdxcli pool --pool my_pool settings name new_name

Show

Show details for a specific pool.

Retrieves and displays the settings of a single pool. If no name is provided, the default pool is used, if one exists.

Usage

hdxcli pool show [OPTIONS] [POOL_NAME]

Options

Option        Description
-i, --indent  Indent the output.

Examples

# Show details for the pool named 'my_pool'
hdxcli pool show my_pool

Integration

Commands to manage public integration resources.

Usage

hdxcli integration [OPTIONS] COMMAND [ARGS]...

Transform

Apply pre-built public transforms to your tables.

Usage

hdxcli integration transform [OPTIONS] COMMAND [ARGS]...

Apply

Apply a public integration transform to your project.

This command fetches a public transform by its INTEGRATION_TRANSFORM_NAME and creates it in your project with the new TRANSFORM_NAME.

Usage

hdxcli integration transform apply [OPTIONS] INTEGRATION_TRANSFORM_NAME TRANSFORM_NAME

Options

Option          Description
--project TEXT  The project to apply the transform to. [required]
--table TEXT    The table to apply the transform to. [required]

Examples

# Apply 'cloudtrail' and name it 'my-cloudtrail-transform' for 'my_proj.my_tbl'
hdxcli integration transform apply cloudtrail my-cloudtrail-transform --project my_proj --table my_tbl

List

List available integration transforms.

Usage

hdxcli integration transform list [OPTIONS]

Examples

# List all integration transforms available
hdxcli integration transform list

Show

Show the definition of a public integration transform.

Usage

hdxcli integration transform show [OPTIONS] TRANSFORM_NAME

Options

Option        Description
-i, --indent  Number of spaces for indentation in the output.

Examples

# Show the JSON definition for the 'cloudtrail' integration transform
hdxcli integration transform show cloudtrail

Query-Option

Manage default query options. This command allows you to list, set, and unset query options that will be applied to all queries within a specific scope.

The scope is determined by the options provided:

  • No options: Manages options at the organization level.
  • --project [NAME]: Manages options for a specific project.
  • --project [NAME] --table [NAME]: Manages options for a specific table.

Usage

hdxcli query-option [OPTIONS] COMMAND [ARGS]...

Options

Option          Description
--project TEXT  Target a specific project by name.
--table TEXT    Target a specific table by name.

List

List the configured query options for the current scope.

Usage

hdxcli query-option list [OPTIONS]

Examples

# List all query options for the organization
hdxcli query-option list

# List all query options for 'my_project'
hdxcli query-option --project my_project list
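
The scope rules for query-option (organization, project, or table) can be captured in a small wrapper. The sketch below is illustrative only: run and list_query_options are hypothetical helper names, and the wrapper does nothing beyond choosing which of the documented --project and --table options to pass. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: list query options at the narrowest scope given.
# Arguments: [PROJECT [TABLE]]
list_query_options() {
    project="${1:-}"; table="${2:-}"

    if [ -n "$table" ]; then
        # Table scope requires both the project and the table.
        run hdxcli query-option --project "$project" --table "$table" list
    elif [ -n "$project" ]; then
        # Project scope.
        run hdxcli query-option --project "$project" list
    else
        # No scope options: organization level.
        run hdxcli query-option list
    fi
}
```

With no arguments the wrapper targets the organization; list_query_options my_project my_table targets a single table.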

Set

Set one or more query options for the specified scope. Options can be set individually using --option, or in bulk from a JSON file using the --from-file option.

Usage

hdxcli query-option set [OPTIONS]

Options

Option              Description
--option KEY VALUE  Set a single query option. Can be used multiple times.
--from-file FILE    Set query options from a JSON file.

Examples

# Set a single option for project 'my_project'
hdxcli query-option --project my_project set --option hdx_query_max_rows 5

# Set multiple options in one command
hdxcli query-option set --option hdx_query_timerange_required true --option hdx_query_max_columns_to_read 20

# Set multiple options from a file
hdxcli query-option set --from-file ./options.json
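
The layout of the file passed to --from-file is not described here. The sketch below assumes a flat JSON object that maps option names to values, which is a guess based on the KEY VALUE form of --option and should be checked against your hdxcli version. The option names are the ones used in the examples above, the run helper and the OPTIONS_FILE variable are hypothetical, and the final command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

OPTIONS_FILE="${OPTIONS_FILE:-./options.json}"

# ASSUMPTION: a flat JSON object of option names to values.
# The real schema expected by --from-file may differ.
cat > "$OPTIONS_FILE" <<'EOF'
{
  "hdx_query_max_rows": 5,
  "hdx_query_timerange_required": true,
  "hdx_query_max_columns_to_read": 20
}
EOF

# Apply the file at project scope.
run hdxcli query-option --project my_project set --from-file "$OPTIONS_FILE"
```

Keeping the options in a file makes the same set easy to apply to several projects and to track in version control.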

Unset

Unset one or more query options for the specified scope. Unset a single option by name, or unset all options for the current scope by using the --all flag.

Usage

hdxcli query-option unset [OPTIONS] [QUERY_OPTION_NAME]

Options

Option  Description
--all   Unset all query options for the scope.

Examples

# Unset a single option for project 'my_project'
hdxcli query-option --project my_project unset hdx_query_max_rows

# Unset all options for the organization
hdxcli query-option unset --all

Show Defaults

Show default settings for various resources. This command retrieves the default configurations from the API for resources such as tables, transforms, sources, and jobs. These defaults are applied when a new resource is created without specifying all its settings.

You can view all defaults at once or filter the output by providing one or more category names (e.g., project, table, transforms) as arguments.

Usage

hdxcli show-defaults [OPTIONS] [CATEGORY]...

Examples

# Display all default settings, grouped by category
hdxcli show-defaults

# Display only the defaults for tables and transforms
hdxcli show-defaults table transforms

Resource Summary

Summarize the count of all resources in the organization. This command provides a quick overview of the total number of projects, tables, transforms, views, and other key resources that the current user has permission to view.

Usage

hdxcli resource-summary [OPTIONS]

Examples

# Display a summary of all resources
hdxcli resource-summary