Utilities & High-Level Operations
Commands for advanced and utility operations. Find tools for cross-cluster migrations, integrity checks, resource pool management, and resource summaries here.
hdxcli v1.0.83
Migrate
Migrate a table and its dependencies to a target cluster.
This command orchestrates a table migration, which can involve two main stages:
- Resource Creation: Replicates the source project, table, and transforms on the target cluster. Optionally, it can also migrate associated functions and dictionaries.
- Data Migration: Copies the table's data from the source storage to the target and updates the catalog to make the data queryable.
Arguments:
- `SOURCE_TABLE`: The source table to migrate, in 'project.table' format.
- `TARGET_TABLE`: The destination for the migration, in 'project.table' format.
Key Options:
- Target Cluster: Specify the destination with `--target-profile` or with individual connection details (`--target-hostname`, `--target-username`, etc.).
- Migration Scope (`--only`):
  - resources: Migrates only the project, table, and other definitions.
  - data: Migrates only the data, assuming resources already exist.
  - If omitted, a full migration (resources and data) is performed.
- Data Handling:
  - `--reuse-partitions`: For clusters sharing storage. Migrates the table definition but reuses the existing data, avoiding a data copy.
  - `--from-date` / `--to-date`: Filter the data to be migrated by a date range.
- Rclone Remote:
  - `--rc-host`, `--rc-user`, `--rc-pass`: Connection details for the Rclone server that will perform the data transfer. Required for any migration that copies data.
Usage
hdxcli migrate [OPTIONS] SOURCE_TABLE TARGET_TABLE
Options
Option | Description |
---|---|
-tp, --target-profile TEXT | Name of the pre-configured profile for the target cluster. |
-h, --target-hostname TEXT | Hostname of the target cluster. |
-u, --target-username TEXT | Username for the target cluster. |
-p, --target-password TEXT | Password for the target cluster. |
-s, --target-uri-scheme [http|https] | URI scheme for the target cluster (http or https). |
--allow-merge | Allow migration even if the source table has the merge process enabled. |
--only [resources|data] | Limit the migration to 'resources' (project, table, etc.) or 'data' (partitions). |
--with-functions | Include functions in the resource migration. |
--with-dictionaries | Include dictionaries in the resource migration. |
--from-date | Minimum timestamp for filtering partitions in YYYY-MM-DD HH:MM:SS format. |
--to-date | Maximum timestamp for filtering partitions in YYYY-MM-DD HH:MM:SS format. |
--reuse-partitions | Reuse existing data partitions instead of copying them. Requires shared storage. |
--rc-host TEXT | The hostname or IP address of the Rclone remote server. |
--rc-user TEXT | The username for authenticating with the Rclone remote server. |
--rc-pass TEXT | The password for authenticating with the Rclone remote server. |
Examples
# Perform a full migration from a staging to a production project, including functions and dictionaries
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --with-functions --with-dictionaries --rc-host rclone.host --rc-user rclone.user --rc-pass rclone.pass
# Migrate only the resources (project, table, etc.), without copying data
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --only resources
# Migrate only data from a given start date onward, assuming resources already exist
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --only data --from-date "2025-01-01 00:00:00" --rc-host rclone.host --rc-user rclone.user --rc-pass rclone.pass
# Perform a migration between clusters that share the same storage backend, avoiding data copy
hdxcli --profile stage migrate staging_proj.logs prod_proj.logs --target-profile prod --reuse-partitions
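The two stages described above can also be run separately. The following is a minimal sketch, not an official workflow: it creates the resources first and then copies a bounded date window of data, using only options from the table above. The `run` helper, the `staged_migrate` function, and the `HDX_EXECUTE` variable are hypothetical names introduced for illustration, and the profile names, table names, and Rclone credentials are placeholders. By default the script only prints the commands so they can be reviewed.

```shell
#!/bin/sh
# Sketch: staged migration (resources first, then data).
# Prints each command by default; set HDX_EXECUTE=1 to run them for real.
# Note: the printed form drops shell quoting, so it is for review only.

run() {
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# staged_migrate SOURCE_TABLE TARGET_TABLE FROM_DATE TO_DATE
staged_migrate() {
  src=$1 dst=$2 from=$3 to=$4

  # Stage 1: replicate the project, table, and transforms on the target.
  run hdxcli --profile stage migrate "$src" "$dst" \
    --target-profile prod --only resources || return 1

  # Stage 2: copy only the partitions inside the date window.
  run hdxcli --profile stage migrate "$src" "$dst" \
    --target-profile prod --only data \
    --from-date "$from" --to-date "$to" \
    --rc-host rclone.host --rc-user rclone.user --rc-pass rclone.pass
}

staged_migrate staging_proj.logs prod_proj.logs \
  "2025-01-01 00:00:00" "2025-02-01 00:00:00"
```

Splitting the stages this way makes it possible to inspect the target resources before any data is copied.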
Check Health
Check the integrity of transforms and autoviews. This command inspects transforms and autoviews for common integrity issues, such as datatype mismatches.
Usage Scenarios:
- No arguments: Checks all transforms in all projects.
- With `PROJECT_NAME`: Limits the check to a specific project.
- With `PROJECT_NAME` and `TABLE_NAME`: Limits the check to a single table.
The `--repair` flag attempts to automatically fix any detected issues that are safely repairable.
Usage
hdxcli check-health [OPTIONS] [PROJECT_NAME] [TABLE_NAME]
Options
Option | Description |
---|---|
--repair | Attempt to automatically repair detected issues. |
Examples
# Check the entire organization
hdxcli check-health
# Check a specific project and attempt to repair issues
hdxcli check-health my_project --repair
# Check a single table within a project
hdxcli check-health my_project my_table
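A cautious way to use `--repair` is to run a report-only check first and repair afterwards. The sketch below assumes nothing about the command's exit codes or output format. The `run` helper, the `check_then_repair` function, and the `HDX_EXECUTE` variable are hypothetical names for illustration. In the default dry-run mode the script only prints the two commands.

```shell
#!/bin/sh
# Sketch: report first, repair second.
# Prints each command by default; set HDX_EXECUTE=1 to run them for real.

run() {
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# check_then_repair [PROJECT_NAME] [TABLE_NAME]
check_then_repair() {
  # Pass 1: report issues without changing anything.
  run hdxcli check-health "$@"

  # When actually executing, ask before repairing.
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    printf 'Repair the issues reported above? [y/N] '
    read -r answer
    [ "$answer" = "y" ] || return 0
  fi

  # Pass 2: same scope, with automatic repair enabled.
  run hdxcli check-health "$@" --repair
}

check_then_repair my_project my_table
```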
Shadow
Shadow tables allow safe testing of transform changes by re-ingesting a small data sample from a source table.
Usage
hdxcli shadow [OPTIONS] COMMAND [ARGS]...
Options
Option | Description |
---|---|
--project PROJECT_NAME | Use or override project set in the profile. |
Create
Create a new shadow table. This command creates a new shadow table and a corresponding transform based on a source table and transform. It requires the source context to be specified via options.
Usage
hdxcli shadow create [OPTIONS] TRANSFORM_SETTINGS_PATH
Options
Option | Description |
---|---|
--source-table TEXT | The source table to shadow. [required] |
--source-transform TEXT | The source transform to shadow. [required] |
--sample-rate INTEGER RANGE | Percentage of the original data to be ingested in the shadow table. [0<=x<=5; required] |
--table-name TEXT | Name of the shadow table. Default: shadow_ + 'source-table-name'. |
--table-settings PATH | Path to a file containing settings for the shadow table. |
--transform-name TEXT | Name of the transform for the shadow table. Default: shadow_ + 'source-transform-name'. |
Examples
# Create a shadow table with a 2% sample rate
hdxcli shadow --project my_project create ./new_transform_for_shadow.json --source-table my_table --source-transform my_transform --sample-rate 2
Delete
Delete a shadow table.
This command removes the shadow table settings from the source transform and then deletes the shadow table itself.
Usage
hdxcli shadow delete [OPTIONS] SHADOW_NAME
Options
Option | Description |
---|---|
--disable-confirmation-prompt | Suppress confirmation to delete the shadow. |
Examples
# Delete the shadow table named 'my_shadow_table'
hdxcli shadow --project my_project delete my_shadow_table
Start
Start or update sampling for a shadow table, setting the specified sampling rate on the source transform.
Usage
hdxcli shadow start [OPTIONS] SHADOW_NAME
Options
Option | Description |
---|---|
--sample-rate INTEGER RANGE | Percentage of the original data to be ingested in the shadow table. [1<=x<=5; required] |
Examples
# Start sampling 3% of data for 'my_shadow_table'
hdxcli shadow --project my_project start my_shadow_table --sample-rate 3
Stop
Stop sampling for a shadow table, setting the sampling rate on the source transform to 0.
Usage
hdxcli shadow stop [OPTIONS] SHADOW_NAME
Examples
# Stop all sampling for 'my_shadow_table'
hdxcli shadow --project my_project stop my_shadow_table
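Taken together, the four shadow subcommands form a lifecycle: create, adjust the sample rate, stop, and delete. The sketch below strings them together using only the options documented above. The `run` helper, the three `shadow_*` function names, and the `HDX_EXECUTE` variable are hypothetical and exist only for illustration. The shadow table name relies on the documented default of `shadow_` plus the source table name. In the default dry-run mode the script prints the commands in order. When executing for real, you would call the functions individually, with time between them to inspect the shadow table.

```shell
#!/bin/sh
# Sketch: shadow table lifecycle.
# Prints each command by default; set HDX_EXECUTE=1 to run them for real.

run() {
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# shadow_begin PROJECT SOURCE_TABLE SOURCE_TRANSFORM TRANSFORM_SETTINGS_PATH
shadow_begin() {
  run hdxcli shadow --project "$1" create "$4" \
    --source-table "$2" --source-transform "$3" --sample-rate 1
}

# shadow_ramp PROJECT SHADOW_NAME RATE   (RATE must be between 1 and 5)
shadow_ramp() {
  run hdxcli shadow --project "$1" start "$2" --sample-rate "$3"
}

# shadow_end PROJECT SHADOW_NAME
shadow_end() {
  # Stop sampling first, then remove the shadow table without a prompt.
  run hdxcli shadow --project "$1" stop "$2" || return 1
  run hdxcli shadow --project "$1" delete "$2" --disable-confirmation-prompt
}

shadow_begin my_project my_table my_transform ./new_transform_for_shadow.json
shadow_ramp  my_project shadow_my_table 3
shadow_end   my_project shadow_my_table
```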
Pool
Commands to create, list, and manage resource pools for cluster services.
Usage
hdxcli pool [OPTIONS] COMMAND [ARGS]...
Options
Option | Description |
---|---|
--pool TEXT | Use or override pool set in the profile. |
Create
Allocates resources (CPU, memory, storage) to create a new service pool.
Usage
hdxcli pool create [OPTIONS] POOL_NAME POOL_SERVICE
Options
Option | Description |
---|---|
-r, --replicas INTEGER | Number of replicas for the workload (default: 1). |
-c, --cpu FLOAT | Dedicated CPU allocation for each replica (default: 0.5). |
-m, --memory FLOAT | Dedicated memory allocation for each replica, in Gi (default: 0.5). |
-s, --storage FLOAT | Storage capacity for each replica, in Gi (default: 0.5). |
Examples
# Create a pool named 'my_pool' for the 'query-peer' service
hdxcli pool create my_pool query-peer --replicas 2 --cpu 1 --memory 2 --storage 10
Delete
Delete a specific pool.
This is a permanent action and cannot be undone. You will be prompted for confirmation unless `--disable-confirmation-prompt` is used.
Usage
hdxcli pool delete [OPTIONS] POOL_NAME
Options
Option | Description |
---|---|
--disable-confirmation-prompt | Suppress confirmation to delete pool. |
Examples
# Delete the pool named 'my_pool'
hdxcli pool delete my_pool
List
List all available pools.
Retrieves a list of all pools you have access to. Pagination options (`--page`, `--page-size`) are available if supported by the API.
Usage
hdxcli pool list [OPTIONS]
Options
Option | Description |
---|---|
-p, --page INTEGER | Page number. |
-s, --page-size INTEGER | Number of items per page. |
Examples
# List the first page of pools
hdxcli pool list
Settings
List, get, or set key-value settings for a specific pool.
This command operates in three modes:
- LIST: Invoked with no arguments, it lists all settings.
- GET: Invoked with only a KEY, it retrieves the value of that setting.
- SET: Invoked with a KEY and a VALUE, it sets the value for that setting.
The VALUE can be a string, a number, or a JSON-formatted string for lists/objects. When setting a value, the `--force` option, which adds the force_operation parameter to the request, may be required for certain resources.
Usage
hdxcli pool settings [OPTIONS] [KEY] [VALUE]
Options
Option | Description |
---|---|
-F, --force | Add the force_operation parameter to the request. |
Examples
# List all settings for the pool 'my_pool'
hdxcli pool --pool my_pool settings
# Get the 'name' setting for the pool 'my_pool'
hdxcli pool --pool my_pool settings name
# Set a new 'name' setting for the pool 'my_pool'
hdxcli pool --pool my_pool settings name new_name
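When the VALUE is a JSON-formatted list or object, it needs to reach the CLI as a single argument, which in a shell usually means wrapping it in single quotes. The sketch below illustrates only that quoting. The setting key `example_list_setting` is hypothetical and does not refer to a real pool setting, and the `run` helper, the `pool_set` function, and the `HDX_EXECUTE` variable are likewise names introduced for illustration. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: passing plain and JSON values to `pool settings`.
# Prints each command by default; set HDX_EXECUTE=1 to run them for real.
# Note: the printed form drops shell quoting, so it is for review only.

run() {
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# pool_set POOL_NAME KEY VALUE
pool_set() {
  # Quoting "$3" keeps a JSON value with spaces as one argument.
  run hdxcli pool --pool "$1" settings "$2" "$3"
}

# Plain string value, as in the example above.
pool_set my_pool name new_name

# JSON list value; 'example_list_setting' is a hypothetical key.
pool_set my_pool example_list_setting '["a", "b"]'
```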
Show
Show details for a specific pool.
Retrieves and displays the settings of a single pool. If no name is provided, the default pool is used, if one exists.
Usage
hdxcli pool show [OPTIONS] POOL_NAME
Options
Option | Description |
---|---|
-i, --indent | Indent the output. |
Examples
# Show details for the pool named 'my_pool'
hdxcli pool show my_pool
Integration
Commands to manage public integration resources.
Usage
hdxcli integration [OPTIONS] COMMAND [ARGS]...
Transform
Apply pre-built public transforms to your tables.
Usage
hdxcli integration transform [OPTIONS] COMMAND [ARGS]...
Apply
Apply a public integration transform to your project.
This command fetches a public transform by its `INTEGRATION_TRANSFORM_NAME` and creates it in your project with the new `TRANSFORM_NAME`.
Usage
hdxcli integration transform apply [OPTIONS] INTEGRATION_TRANSFORM_NAME TRANSFORM_NAME
Options
Option | Description |
---|---|
--project TEXT | The project to apply the transform to. [required] |
--table TEXT | The table to apply the transform to. [required] |
Examples
# Apply 'cloudtrail' and name it 'my-cloudtrail-transform' for 'my_proj.my_tbl'
hdxcli integration transform apply cloudtrail my-cloudtrail-transform --project my_proj --table my_tbl
List
List available integration transforms.
Usage
hdxcli integration transform list [OPTIONS]
Examples
# List all integration transforms available
hdxcli integration transform list
Show
Show the definition of a public integration transform.
Usage
hdxcli integration transform show [OPTIONS] TRANSFORM_NAME
Options
Option | Description |
---|---|
-i, --indent | Number of spaces for indentation in the output. |
Examples
# Show the JSON definition for the 'cloudtrail' integration transform
hdxcli integration transform show cloudtrail
Query-Option
Manage default query options. This command allows you to list, set, and unset query options that will be applied to all queries within a specific scope.
The scope is determined by the options provided:
- No options: Manages options at the organization level.
- `--project [NAME]`: Manages options for a specific project.
- `--project [NAME] --table [NAME]`: Manages options for a specific table.
Usage
hdxcli query-option [OPTIONS] COMMAND [ARGS]...
Options
Option | Description |
---|---|
--project TEXT | Target a specific project by name. |
--table TEXT | Target a specific table by name. |
List
List the configured query options for the current scope.
Usage
hdxcli query-option list [OPTIONS]
Examples
# List all query options for the organization
hdxcli query-option list
# List all query options for 'my_project'
hdxcli query-option --project my_project list
Set
Set one or more query options for the specified scope. Options can be set individually using `--option`, or in bulk from a JSON file using the `--from-file` option.
Usage
hdxcli query-option set [OPTIONS]
Options
Option | Description |
---|---|
--option KEY VALUE | Set a single query option. Can be used multiple times. |
--from-file FILE | Set query options from a JSON file. |
Examples
# Set a single option for project 'my_project'
hdxcli query-option --project my_project set --option hdx_query_max_rows 5
# Set multiple options in one command
hdxcli query-option set --option hdx_query_timerange_required true --option hdx_query_max_columns_to_read 20
# Set multiple options from a file
hdxcli query-option set --from-file ./options.json
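The layout of the JSON file is not described here. The sketch below assumes a flat JSON object that maps option names to values, reusing the option names from the examples above. Treat that layout as an assumption and confirm it against your hdxcli version. The `run` helper, the `write_options_file` function, and the `HDX_EXECUTE` variable are hypothetical names for illustration. The script writes the file to a temporary location and, by default, only prints the command that would apply it at table scope.

```shell
#!/bin/sh
# Sketch: bulk-setting query options from a file at table scope.
# Prints the hdxcli command by default; set HDX_EXECUTE=1 to run it.

run() {
  if [ "${HDX_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# write_options_file PATH
# Assumed layout: one flat JSON object of option name -> value.
write_options_file() {
  cat > "$1" <<'EOF'
{
  "hdx_query_max_rows": 5,
  "hdx_query_timerange_required": true,
  "hdx_query_max_columns_to_read": 20
}
EOF
}

options_file="${TMPDIR:-/tmp}/hdx_query_options.json"
write_options_file "$options_file"

# Apply the file to a single table within a project.
run hdxcli query-option --project my_project --table my_table \
  set --from-file "$options_file"
```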
Unset
Unset one or more query options for the specified scope. Unset a single option by name, or unset all options for the current scope by using the `--all` flag.
Usage
hdxcli query-option unset [OPTIONS] [QUERY_OPTION_NAME]
Options
Option | Description |
---|---|
--all | Unset all query options for the scope. |
Examples
# Unset a single option for project 'my_project'
hdxcli query-option --project my_project unset hdx_query_max_rows
# Unset all options for the organization
hdxcli query-option unset --all
Show Defaults
Show default settings for various resources. This command retrieves the default configurations from the API for resources such as tables, transforms, sources, and jobs. These defaults are applied when a new resource is created without specifying all its settings.
You can view all defaults at once or filter the output by providing one or more category names (e.g., project, table, transforms) as arguments.
Usage
hdxcli show-defaults [OPTIONS] [CATEGORY]...
Examples
# Display all default settings, grouped by category
hdxcli show-defaults
# Display only the defaults for tables and transforms
hdxcli show-defaults table transforms
Resource Summary
Summarize the count of all resources in the organization. This command provides a quick overview of the total number of projects, tables, transforms, views, and other key resources that the current user has permission to view.
Usage
hdxcli resource-summary [OPTIONS]
Examples
# Display a summary of all resources
hdxcli resource-summary