Scaling using TOML Configuration

Hydrolix cluster scaling can be configured in two ways. The first is the scale command, listing the components you wish to scale; more information can be found in the HDXCTL reference.

The second is a configuration file, which is accessible via two commands.

To print the contents of the configuration to STDOUT:

hdxctl scale CLIENTID CLUSTERID --emit-toml

Or, to edit the scaling TOML directly using vi and update the cluster when finished:

hdxctl scale CLIENTID CLUSTERID --edit

Using --edit

The --edit option downloads the scale configuration file and opens it in vi. Once you have saved your changes, the cluster configuration is updated when the vi session closes.

Using --emit-toml

When using --emit-toml, you can write the output to a file:

hdxctl scale CLIENTID CLUSTERID --emit-toml > FILENAME

Then load it back to update the scaling:

hdxctl scale CLIENTID CLUSTERID --from-file FILENAME

For example:

$ hdxctl scale hdxcli-123456 hdx-abcdef --emit-toml > myscale.TOML

$ vi myscale.TOML

..... make changes.....

$ hdxctl scale hdxcli-123456 hdx-abcdef --from-file myscale.TOML

TOML Structure

The TOML configuration is hierarchical: defaults are specified first, and service pools and individual components either inherit those settings or override them with their own specific settings.

$ hdxctl scale hdxcli-123456 hdx-abcdef --emit-toml
[service.head.instance]
family = "c5n"
size = "9xlarge"
count = "1"
disk = 30
spot = false
cache_disk = 0

[service.stream-head.instance]
family = "r5"
size = "xlarge"
count = "1"
disk = 30
spot = false
cache_disk = 0

[service.rds.instance]
family = "db.r5"
size = "large"
disk = 30

.........

Section names within the TOML are specified as follows:

| Configuration | Description |
|---|---|
| service.default.instance | Default configuration used for any undefined service. |
| pool_default.batch-peer.instance | Default configuration used for Batch peer instances. |
| pool_default.kafka-peer.instance | Default configuration used for Kafka peer instances. |
| pool_default.merge-peer.instance | Default configuration used for Merge peer instances. |
| pool_default.peer.instance | Default configuration used for Query peer instances. |
| pool_default.stream-peer.instance | Default configuration used for Stream peer instances. |
| service.bastion.instance | Configuration used for the Bastion instance for the cluster. |
| service.config.instance | Configuration used for the Config API service for the cluster. |
| service.grafana.instance | Configuration used for Grafana instances. |
| service.head.instance | Configuration used for the Query Head instances. |
| service.intake-misc.instance | Configuration used for the intake-misc instances. |
| service.prometheus.instance | Configuration used for Prometheus instances. |
| service.rds.instance | Configuration used for RDS instances used for the catalog server. Be wary |
| service.stream-head.instance | Configuration used for the Stream Head instances. |
| service.superset.instance | Configuration used for Superset instances. |
| service.ui.instance | Configuration used for the User Interface service for the cluster. |
| service.web.instance | Configuration used for the Web instance for the cluster. |
| service.zookeeper.instance | Configuration used for Zookeeper instances. |
| pool.batch-peer0.instance | Configuration for a batch-peer pool. |
| pool.kafkapool-UUID.instance | Configuration for a Kafka-peer pool. |
| pool.merge-peer0.instance | Configuration for a merge-peer pool. |
| pool.query-peer0.instance | Configuration for a query-peer pool. |
| pool.stream-peer0.instance | Configuration for a stream-peer pool. |

Each service and pool has the following key-value pairs.

| Parameter | Description | Example |
|---|---|---|
| service | The type of service component to be configured. | "kafka-peer", "peer", "merge-peer" |
| count | How many instances to use for a service. If auto scaling is used (query-peer only), provide either a min and max (e.g. 2-5) or a min, desired, and max (e.g. 2-5-10). | 1, 2-5, 2-5-10 |
| family | Family for the instances to be deployed. | r5, c5n, t5 |
| disk | The size of the EBS volume to be used by the instance. | 30 |
| spot | Whether the cluster should use spot instances (query-peer only). | true or false |
| cache_disk | The size of the cache_disk to be used (query only). | 30 |
| on_demand_percent | The percentage of instances that should be on-demand when spot instances are used. Note that AWS rounds up, so an on_demand_percent greater than 0 but below 34 can still give you one on-demand instance. | |
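
The count formats and the rounding note can be made concrete with a little arithmetic. The following sketch is hedged in two ways. First, parse_count and on_demand_instances are hypothetical helpers, not hdxctl functions, and leaving desired unset for the two-part form is a choice made here. Second, the threshold of 34 in the note matches a pool of three instances when the on-demand count is rounded up; that pool size is an inference from the numbers, not something stated above.

```python
from typing import Optional


def parse_count(spec: str) -> tuple[int, Optional[int], int]:
    """Parse "N", "MIN-MAX", or "MIN-DESIRED-MAX" into (min, desired, max).

    desired is None when the spec does not give one (the two-part form).
    """
    parts = [int(p) for p in str(spec).split("-")]
    if len(parts) == 1:
        return parts[0], parts[0], parts[0]
    if len(parts) == 2:
        low, high = parts
        desired = None
    elif len(parts) == 3:
        low, desired, high = parts
    else:
        raise ValueError(f"unrecognised count spec: {spec!r}")
    if low > high or (desired is not None and not low <= desired <= high):
        raise ValueError(f"count values out of order: {spec!r}")
    return low, desired, high


def on_demand_instances(total: int, percent: int) -> int:
    """Round total * percent / 100 up to a whole instance (integer math)."""
    return (total * percent + 99) // 100


if __name__ == "__main__":
    for spec in ("1", "2-5", "2-5-10"):
        print(spec, parse_count(spec))
    # Assumed pool of three instances: 1 to 33 percent gives one on-demand
    # instance, and 34 percent tips over to two.
    for percent in (0, 1, 33, 34):
        print(percent, on_demand_instances(3, percent))
```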

