Hydrolix Spec Configuration Validator

Automate validation for your Hydrolix cluster configuration

Overview

As of Hydrolix v5.4.0, Hydrolix clusters implement a validating webhook for the hdx spec configuration by default. This webhook will verify that any changes applied to the spec configuration are valid, and will either issue warnings or will refuse to update the cluster depending on its configuration.

By default, the validator runs in ignore mode. In this mode the operator logs any configuration errors as warnings, but still applies the changes.

Modify the configuration validator

This webhook can be modified or disabled when installing operator resources. You can enable, update, or disable the validating webhook using the validation-policy query parameter of the operator-resources URL, which is explained further below.

First, set the following environment variables:

export HDX_NAMESPACE={hdx-namespace}
export HDX_VERSION={hdx-cluster-version}
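
Every command on this page depends on both variables, so it can help to confirm they are set before continuing. The following is a minimal sketch; the function name require_hdx_env is a hypothetical helper and not part of any Hydrolix tooling.

```shell
# Hypothetical helper (not part of Hydrolix tooling):
# return non-zero if either variable is unset or empty.
require_hdx_env() {
  if [ -z "${HDX_NAMESPACE:-}" ] || [ -z "${HDX_VERSION:-}" ]; then
    echo "HDX_NAMESPACE and HDX_VERSION must both be set" >&2
    return 1
  fi
}

# Usage: require_hdx_env && echo "environment looks ready"
```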

The webhook supports three modes, which apply different failure policies:

Ignore mode (default)

Run the following command to install the ValidatingWebhookConfiguration alongside the operator. This uses the validation-policy query parameter to:

  1. Start up a webhook pod if one is not already running
  2. Patch the ValidatingWebhookConfiguration resource with Failure Policy: Ignore.

curl "https://hkw.hydrolix.live/$HDX_VERSION/operator-resources?namespace=$HDX_NAMESPACE&validation-policy=ignore" | kubectl apply -f -
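
Because only the validation-policy value differs between modes, the URL can be built by a small function. This is a sketch under assumptions: operator_resources_url is a hypothetical name, and the accepted values (ignore, fail, none) are simply the three policies described on this page.

```shell
# Hypothetical helper: print the operator-resources URL for a validation policy.
# Accepts only the three policies described on this page.
operator_resources_url() {
  policy="$1"
  case "$policy" in
    ignore|fail|none) ;;
    *) echo "unknown validation-policy: $policy" >&2; return 1 ;;
  esac
  printf 'https://hkw.hydrolix.live/%s/operator-resources?namespace=%s&validation-policy=%s\n' \
    "$HDX_VERSION" "$HDX_NAMESPACE" "$policy"
}

# Usage (the quotes keep the shell from treating "&" as a background operator):
#   curl "$(operator_resources_url ignore)" | kubectl apply -f -
```

Rejecting unknown values before calling curl avoids applying resources fetched with a mistyped policy.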

Verify ignore mode configuration

Run the following to inspect the ValidatingWebhookConfiguration resources:

kubectl describe validatingwebhookconfigurations

The output describes each object and includes the following:

Name: hdx-validator-${HDX_NAMESPACE}
Kind: ValidatingWebhookConfiguration
Metadata:
  Failure Policy:  Ignore

A failure policy of Ignore indicates that the validation-policy change has been applied.
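
To check only the failure policy rather than reading the full describe output, kubectl's jsonpath output format can be used. The sketch below assumes the resource is named hdx-validator-<namespace>, as in the output above; the function name webhook_failure_policy is hypothetical.

```shell
# Hypothetical helper: print the failurePolicy of each webhook in the
# ValidatingWebhookConfiguration named hdx-validator-<namespace>.
webhook_failure_policy() {
  kubectl get validatingwebhookconfiguration "hdx-validator-$1" \
    -o jsonpath='{.webhooks[*].failurePolicy}'
}

# Usage: webhook_failure_policy "$HDX_NAMESPACE"
```

The same check applies after switching to fail mode, where the expected value is Fail.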

You can also test the mode by making some invalid changes to the hdx spec configuration, for example with these invalid tunables:

spec:
  kubernetes_namespace: my-namespace
  overcommit: make-it-so
  scale_profile: development

Add the invalid tunables to the Hydrolix cluster config using this command:

kubectl -n $HDX_NAMESPACE edit hdx hdx

This should result in output like the following:

Warning: Ignoring unrecognized tunable 'kubernetes_namespace'
Warning: Unsupported value 'make-it-so' for tunable 'overcommit', expected one of: ['requests', 'limits', 'all', 'both', 'true', 'True', 'false', 'False'] (ignored error)
Warning: Unsupported value 'development' for tunable 'scale_profile', expected one of: ['dev', 'prod', 'ci', 'bench', 'eval', 'mega'] (ignored error)
hydrolixcluster.hydrolix.io/hdx edited

In ignore mode, the validation handler logs any errors raised for a HydrolixCluster object as warnings. It still applies all changes, including those that produced an error.

Fail mode

With fail mode enabled, if the spec sets a tunable that has a defined list of valid options to a value outside that list, the operator logs an error and refuses to apply the set of changes to the cluster.

To enable fail mode, run the following command. Doing so installs the ValidatingWebhookConfiguration alongside the operator. This uses the validation-policy query parameter to:

  1. Start up a webhook pod if one is not already running
  2. Patch the ValidatingWebhookConfiguration resource with Failure Policy: Fail.

curl "https://hkw.hydrolix.live/$HDX_VERSION/operator-resources?namespace=$HDX_NAMESPACE&validation-policy=fail" | kubectl apply -f -

Verify fail mode configuration

Run the following to inspect the ValidatingWebhookConfiguration resources:

kubectl describe validatingwebhookconfigurations

The output describes each object and includes the following:

Name: hdx-validator-${HDX_NAMESPACE}
Kind: ValidatingWebhookConfiguration
Metadata:
  Failure Policy:  Fail

A failure policy of Fail indicates that the validation-policy change has been applied.

You can also test the mode by making some invalid changes to the hdx spec configuration, for example with these invalid tunables:

spec:
  kubernetes_namespace: my-namespace
  overcommit: make-it-so
  scale_profile: development

Add the invalid tunables to the Hydrolix cluster config using this command:

kubectl -n $HDX_NAMESPACE edit hdx hdx

This should result in output like the following:

Warning: Ignoring unrecognized tunable 'kubernetes_namespace'
error: hydrolixclusters.hydrolix.io "hdx" could not be patched: admission webhook "validate.hydrolix.io" denied the request:
* Unsupported value 'make-it-so' for tunable 'overcommit', expected one of: ['requests', 'limits', 'all', 'both', 'true', 'True', 'false', 'False']
* Unsupported value 'development' for tunable 'scale_profile', expected one of: ['dev', 'prod', 'ci', 'bench', 'eval', 'mega']
You can run `kubectl replace -f /var/folders/mj/3blarg1337lh0ht2_24s4mc0000gn/T/kubectl-edit-952407693.yaml` to try this update again.

If the validation handler captures any errors, changes to the hdx spec are rejected and the reasons are displayed. Correct the errors and re-apply the configuration for the changes to take effect on the cluster.

No validation mode

To disable validation, run the following command:

curl "https://hkw.hydrolix.live/$HDX_VERSION/operator-resources?namespace=$HDX_NAMESPACE&validation-policy=none" | kubectl apply -f -

If validation was previously enabled, this command does not automatically remove the existing validation-related resources. Clean them up with:

kubectl delete validatingwebhookconfiguration hdx-validator-$HDX_NAMESPACE
kubectl -n $HDX_NAMESPACE delete service hdx-webhooks
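
If the cleanup might run more than once, or on a cluster where validation was never enabled, kubectl's --ignore-not-found flag keeps the delete commands from failing on missing resources. This is a sketch only; cleanup_validator is a hypothetical helper name, and the resource names are the ones used in the commands above.

```shell
# Hypothetical helper: remove leftover validator resources; safe to re-run
# because --ignore-not-found suppresses errors for resources already deleted.
cleanup_validator() {
  ns="$1"
  kubectl delete validatingwebhookconfiguration "hdx-validator-$ns" --ignore-not-found
  kubectl -n "$ns" delete service hdx-webhooks --ignore-not-found
}

# Usage: cleanup_validator "$HDX_NAMESPACE"
```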

If you need to re-enable the validator later, you can use either of the commands in the ignore mode (default) or fail mode sections.

Verify that the validator is disabled

Try making some invalid changes to the hdx spec configuration:

kubectl -n $HDX_NAMESPACE edit hdx hdx

For instance, you can try adding these invalid tunables:

spec:
  kubernetes_namespace: my-namespace
  overcommit: make-it-so
  scale_profile: development

This should result in output like the following:

hydrolixcluster.hydrolix.io/hdx edited

This configuration results in the following:

  • The operator does not start a webhook server
  • The operator does not log any warnings or errors related to invalid configuration