Hydrolix ACME Client
Overview
Hydrolix includes software for automatic certificate acquisition and renewal.
This is the default and recommended option for managing TLS in a Hydrolix cluster. See Enable TLS for more TLS options.
The TLS industry has adopted a standard called Automatic Certificate Management Environment (ACME, RFC 8555), which a Certification Authority (CA) can support to automate certificate issuance and renewal.
This page describes how to use the ACME client in a Hydrolix cluster for automated provisioning of TLS certificates.
The Hydrolix implementation automates sourcing certificates from the Let's Encrypt CA using the Lego project.
Prerequisites
Before you can start this guide, you'll need the following:
- A deployed Hydrolix cluster
- A publicly reachable IP address on the cluster
- An active DNS record for the cluster's name
Generate a certificate in Hydrolix
- Remove the traefik-tls secret before proceeding. To prevent loss of an existing certificate, the Hydrolix ACME client won't overwrite an existing traefik-tls secret.
- Double-check your hydrolix_url for spelling errors, ensuring that you can look up the name in the DNS.
- Set the configuration option acme_enabled to true in hydrolixcluster.yaml.
- Load the configuration changes to your Hydrolix cluster.
- Hydrolix automatically generates a certificate for your cluster and stores it in a Kubernetes secret named traefik-tls.
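The steps above can be sketched as a small shell function. This is a minimal sketch rather than the official procedure: the function name request_certificate and its three arguments (namespace, hostname, path to hydrolixcluster.yaml) are hypothetical, as is the HDX_KUBERNETES_NAMESPACE variable in the example comment. It assumes kubectl and host are installed and that acme_enabled is already set to true in the file.

```shell
# Hypothetical helper: request_certificate NAMESPACE HOSTNAME CONFIG_FILE
# Assumes CONFIG_FILE already sets acme_enabled to true.
request_certificate() {
  # Remove any existing secret; the ACME client won't overwrite it.
  kubectl delete secret traefik-tls --namespace "$1" --ignore-not-found || return 1

  # Stop early if the cluster hostname doesn't resolve.
  if ! host "$2" >/dev/null; then
    echo "DNS lookup failed for $2" >&2
    return 1
  fi

  # Load the configuration changes into the cluster.
  kubectl apply --namespace "$1" -f "$3"
}

# Example (variable names are assumptions):
# request_certificate "$HDX_KUBERNETES_NAMESPACE" "$HDX_HOSTNAME" hydrolixcluster.yaml
```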
Let's Encrypt with an open allowlist
Hydrolix selects the Let's Encrypt CA if your cluster allows inbound access to 0.0.0.0/0 in the ip_allowlist.
Certificates are valid for 90 days. Hydrolix checks weekly for expiration and renews early.
Automatic renewal
Hydrolix runs an acme-renewal job which attempts to renew the certificate if the remaining time before expiration is less than 37 days.
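To check by hand whether a certificate has entered that 37-day window, one option is the openssl x509 -checkend flag. The sketch below only mirrors the threshold described above and isn't the renewal job's own logic; the renewal_due name is hypothetical, and it assumes the certificate has been saved locally as a PEM file.

```shell
# Hypothetical helper: renewal_due CERT_FILE
# Succeeds (exit 0) when the PEM certificate expires in under 37 days.
renewal_due() {
  window_seconds=$((37 * 24 * 60 * 60))  # 37 days = 3196800 seconds
  # -checkend exits non-zero if the certificate expires within the window,
  # so the result is inverted to mean "renewal is due".
  ! openssl x509 -in "$1" -noout -checkend "$window_seconds" >/dev/null
}

# Example:
# renewal_due cert.pem && echo "inside the renewal window"
```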
Reminder notifications
Hydrolix doesn't provide any email notifications of certificate expiration. CAs usually send courtesy messages to the admin_email used for acquiring the certificate. This email address should receive notifications of upcoming expiration directly from the CA.
Confirm certificate generation
To confirm certificate deployment, run the following command to view details of the certificate.
For example:
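One way to do this is sketched below. It assumes the traefik-tls secret stores the certificate under the standard Kubernetes tls.crt key, and that kubectl, base64, and openssl are available. The show_certificate name and its namespace argument are hypothetical.

```shell
# Hypothetical helper: show_certificate NAMESPACE
# Assumes the certificate is stored under the standard tls.crt key.
show_certificate() {
  kubectl get secret traefik-tls --namespace "$1" \
      --output jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -subject -issuer -dates
}

# Example (variable name is an assumption):
# show_certificate "$HDX_KUBERNETES_NAMESPACE"
```

The output lists the certificate subject, the issuing CA, and the notBefore and notAfter dates.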
See also Enable TLS.
Troubleshoot
The following steps are specific to the Hydrolix ACME client. See also the common Certificate Troubleshooting tips.
- Check the hostname for spelling errors and correct any mistakes.
- Verify a successful DNS lookup to one or more DNS resolvers for the hostname.
- Confirm from your workstation with the host $HDX_HOSTNAME command; if it fails, confirm that the authoritative DNS entry is present.
- Confirm using a public resolver with the command host $HDX_HOSTNAME 8.8.8.8. If it fails, diagnose and resolve the authoritative DNS issue.
- Confirm the init-acme job has run; see Verify the cluster job runs.
- Confirm the acme-renewal job is scheduled; see Verify Kubernetes cron job runs.
- Confirm that there is a reachable HTTP service inside the Hydrolix cluster. If this fails, see Verify startup.
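The DNS and HTTP checks in the list above can be run together. This is a minimal sketch under two assumptions: the check_reachability name is hypothetical, and the cluster is expected to answer plain HTTP on port 80 at the hostname in hydrolix_url.

```shell
# Hypothetical helper: check_reachability HOSTNAME
# Return codes: 1 = default resolver failed, 2 = public resolver failed,
#               3 = HTTP request failed.
check_reachability() {
  # Lookup through the workstation's default resolver.
  host "$1" >/dev/null || return 1
  # Lookup through a public resolver.
  host "$1" 8.8.8.8 >/dev/null || return 2
  # Assumption: the cluster answers plain HTTP on port 80.
  curl --silent --head --max-time 10 --output /dev/null "http://$1/" || return 3
}

# Example:
# check_reachability "$HDX_HOSTNAME" || echo "check failed with code $?"
```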
The init-acme and acme-renewal jobs each retry up to six (6) times after a failure, then stop. This isn't configurable.
Verify the cluster job runs
When the Hydrolix ACME client is enabled, the job init-acme is started. To find the init job, run:
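A minimal sketch of such a lookup, assuming kubectl is configured for the cluster; the find_init_job name and its namespace argument are hypothetical.

```shell
# Hypothetical helper: find_init_job NAMESPACE
find_init_job() {
  # List jobs and keep the header row plus any init-acme entries.
  kubectl get jobs --namespace "$1" | grep -E '^NAME|init-acme'
}

# Example (variable name is an assumption):
# find_init_job "$HDX_KUBERNETES_NAMESPACE"
```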
This job may disappear upon success or if the service has been running for more than a day. Hydrolix clears successful job notifications automatically.
Verify Kubernetes cron job runs
The acme-renewal cron job is created at startup. It runs on a schedule, checks whether renewal is required, and, if so, renews the certificate with the ACME provider.
To verify, find the acme-renewal job:
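A minimal sketch, assuming kubectl is configured for the cluster; the renewal_schedule name and its namespace argument are hypothetical. It prints the cron expression stored in the cron job's spec.schedule field, which confirms both that the cron job exists and when it runs.

```shell
# Hypothetical helper: renewal_schedule NAMESPACE
renewal_schedule() {
  # Fails if the cron job is missing; otherwise prints its cron expression.
  kubectl get cronjob acme-renewal --namespace "$1" \
      --output jsonpath='{.spec.schedule}'
}

# Example (variable name is an assumption):
# renewal_schedule "$HDX_KUBERNETES_NAMESPACE"
```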
Verify startup
List the pods to locate the identifier of the init-acme job.
With the pod identifier, examine the logs:
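Both steps can be combined in one sketch. It relies on the job-name label that Kubernetes adds to pods created by a job; the init_acme_logs name and its namespace argument are hypothetical.

```shell
# Hypothetical helper: init_acme_logs NAMESPACE
init_acme_logs() {
  # Kubernetes labels each job's pods with job-name=<job>.
  pod=$(kubectl get pods --namespace "$1" \
          --selector job-name=init-acme --output name | head -n 1)
  if [ -z "$pod" ]; then
    echo "no init-acme pod found in namespace $1" >&2
    return 1
  fi
  # $pod has the form pod/<identifier>, which kubectl logs accepts directly.
  kubectl logs --namespace "$1" "$pod"
}

# Example (variable name is an assumption):
# init_acme_logs "$HDX_KUBERNETES_NAMESPACE"
```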
An example of successful initialization:
Consider IP allowlist
There are several reasons for unsuccessful initialization, but one failure scenario involves a conflict between the use of an IP allowlist and the ACME provider's verification steps. The symptom appears in the logs of the init-acme job as a Routing not ready error.
The above failure indicates that the test client can't reach the configured hostname in the Hydrolix spec hydrolix_url.
Traffic from the test client leaves the cluster completely, returning as ingress traffic, in order to simulate an incoming ACME challenge request.
After confirming again that the cluster hostname in the hydrolix_url is a publicly visible and valid DNS name, consider temporarily adding 0.0.0.0/0 to the ip_allowlist. See Configure IP Access.
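To spot this symptom quickly, the init-acme logs can be piped through a small filter. This is a sketch only: the check_routing_error name is hypothetical, and it matches nothing more than the Routing not ready text described above.

```shell
# Hypothetical helper: reads init-acme log text on stdin.
check_routing_error() {
  if grep -qi 'Routing not ready'; then
    echo "Routing not ready: check the hydrolix_url DNS name and the ip_allowlist"
    return 1
  fi
  echo "No Routing not ready error found"
}

# Example: pipe saved init-acme log output into the filter.
# check_routing_error < init-acme.log
```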
Remove certificate and jobs
To start over with a clean slate:
- Remove the acme-account and traefik-tls secrets.
- Remove the init-acme job.
- Remove the acme-renewal cronjob.
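The removal steps above can be sketched with kubectl delete. The reset_acme name and its namespace argument are hypothetical; --ignore-not-found keeps the function from failing when a resource has already been removed.

```shell
# Hypothetical helper: reset_acme NAMESPACE
reset_acme() {
  # Remove the ACME account and certificate secrets.
  kubectl delete secret acme-account traefik-tls --namespace "$1" --ignore-not-found
  # Remove the initialization job.
  kubectl delete job init-acme --namespace "$1" --ignore-not-found
  # Remove the renewal cron job.
  kubectl delete cronjob acme-renewal --namespace "$1" --ignore-not-found
}

# Example (variable name is an assumption):
# reset_acme "$HDX_KUBERNETES_NAMESPACE"
```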
Hydrolix-generated certificates update automatically
If you're having issues with renewal, please contact Hydrolix Support.