Configuration Options Reference

Hydrolix Cluster Specification

The hydrolixcluster.yaml spec accepts many options that control the behavior of your Hydrolix cluster.
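
As a rough sketch of where these options live, the fragment below places a handful of settings from this reference under the `spec` key. The `apiVersion`, `kind`, and `metadata` values are assumptions about the custom-resource wrapper and the namespace is hypothetical, so check them against the file generated for your own deployment.

```yaml
# Sketch only: apiVersion, kind, and metadata are assumptions about the
# wrapper around the spec; the keys under spec are settings from this page.
apiVersion: hydrolix.io/v1
kind: HydrolixCluster
metadata:
  name: hdx
  namespace: my-namespace          # hypothetical namespace
spec:
  hydrolix_name: hdx
  hydrolix_url: https://my-host.mydomain.com
  db_bucket_url: s3://my-bucket
  db_bucket_region: us-east-2
  db_bucket_credentials_method: web_identity
  scale_profile: eval
  ip_allowlist:
    - 127.0.0.1/32
```

Settings you leave out of `spec` fall back to the defaults listed in this reference.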

| Setting | Default Value | Definition | Example Values |
| --- | --- | --- | --- |
| acme_enabled | false | Whether or not the ACME certificate challenge server is enabled. |  |
| admin_email | "[email protected]" | Admin email that receives a password once the cluster starts. |  |
| aws_credentials_method | "static" | The credentials method used to authenticate with AWS. Deprecated: Use db_bucket_credentials_method. |  |
| aws_load_balancer_tags | "Environment=dev,Team=test" | Additional tags added to the load balancer of the traefik service when running in EKS. |  |
| aws_load_balancer_subnets | "service.beta.kubernetes.io/aws-load-balancer-subnets" | A list of subnets assigned to the load balancer of the traefik service when running in EKS. |  |
| azure_blob_storage_account | null | The storage account to use when accessing an Azure blob storage container. |  |
| basic_auth | [] | A list of Hydrolix services that should be protected with basic auth when accessed. For more information, see Enabling Access & TLS. This is incompatible with the unified_auth setting, which has been on by default since the 4.6 release. |  |
| batch_peer_heartbeat_period | "5m" | How frequently a batch peer should heartbeat any task it's working on, expressed as a duration string. |  |
| catalog_db_admin_user | "turbine" | When PostgreSQL is managed externally, the admin user of the Catalog. |  |
| catalog_db_admin_db | "turbine" | The default db of the admin user of the Catalog. |  |
| catalog_db_host | "postgres" | Hostname used by the external PostgreSQL instance serving as the Hydrolix instance Catalog. |  |
| catalog_db_port | null | The PostgreSQL catalog port. |  |
| catalog_intake_connections | "max_lifetime": "10m", "max_idle_time": "1m" | Catalog database connection pool settings for intake services. Available options: (1) 'max_lifetime' - the max duration that a connection can live before being recycled. (2) 'max_idle_time' - the max duration that a connection can be idle before being closed. (3) 'max' - the max number of connections that can be opened by each intake service that connects to the database. (4) 'min' - the minimum number of connections to keep open to the database. (5) 'check_writable' - if true, then a connection is opened to the database to ensure it can handle writes. |  |
| clickhouse_http_port | 8088 | The port to dedicate to the ClickHouse HTTP interface. |  |
| db_bucket_credentials_method | "web_identity" | The method Hydrolix uses to acquire credentials for connecting to cloud storage. | "static", "ec2_profile", "web_identity" |
| db_bucket_endpoint | null | The endpoint URL for S3-compatible object storage services. Not required if using AWS S3 or if db_bucket_url is provided. |  |
| db_bucket_name | null | The name of the bucket you would like Hydrolix to store data in. Not required if db_bucket_url is provided. |  |
| db_bucket_region | null | The region of the bucket you would like Hydrolix to store data in. Not required if it can be inferred from db_bucket_url. | "us-east-2", "us-central1" |
| db_bucket_type | null | The object storage type of the bucket you would like Hydrolix to store data in. Not required if db_bucket_url is provided. | "gs", "s3" |
| db_bucket_url | null | The URL of the cloud storage bucket you would like Hydrolix to store data in. | "gs://my-bucket", "s3://my-bucket", "https://my-bucket.s3.us-east-2.amazonaws.com", "https://s3.us-east-2.amazonaws.com/my-bucket", "https://my-bucket.us-southeast-1.linodeobjects.com", "https://minio.local/my-bucket" |
| db_bucket_use_https | true | If true, use HTTPS when connecting to the cloud storage service. Inferred from db_bucket_url if possible. |  |
| decay_batch_size | 5000 | Number of entries to fetch for each request to the catalog. |  |
| decay_enabled | true | Whether or not the Decay CronJob should run. |  |
| decay_max_deactivate_iterations |  | Maximum number of deactivation iterations to execute per table. |  |
| decay_max_reap_iterations |  | Maximum number of reap iterations to execute per table. |  |
| decay_reap_batch_size | 5000 | Number of entries to fetch for each request when locating entries for reaping. |  |
| decay_schedule | 0 0 * * * | CRON schedule for Decay CronJob. |  |
| default_query_pool | "query-peer" | A name for the default query pool. |  |
| disable_disk_cache | false | Whether or not to disable disk caching. |  |
| disable_traefik_http_port | false | Whether or not to disable the default HTTP port (80) used by the traefik service. |  |
| disable_traefik_https_port | false | Whether or not to disable the default HTTPS port (443) used by the traefik service. |  |
| disable_traefik_native_port | false | Whether or not to disable the default Native (TCP) ClickHouse port used by the traefik service. Uses port 9000 when TLS is disabled; 9440 when TLS is enabled. |  |
| disable_traefik_clickhouse_http_port | false | Whether or not to disable the default ClickHouse HTTP port (8088) used by the traefik service. |  |
| disable_vector_bucket_logging | false | Whether or not to disable bucket logging via vector. |  |
| disable_vector_kafka_logging | false | Whether or not to disable Kafka logging via vector. |  |
| disk_cache_entry_max_ttl_minutes | 360 | Max TTL for a disk cache entry. It is the longest period of time for which the LRU disk cache can save an entry before it expires. |  |
| dns_server_ip |  | The IP address of a DNS server used for performance critical purposes. |  |
| dns_aws_max_resolution_attempts |  | The maximum number of attempts made by the DNS resolver for AWS and all S3-compatible storages in a given DNS refresh cycle. |  |
| dns_aws_max_ttl_secs |  | The maximum DNS TTL for AWS and S3-compatible storages. It is the longest period of time the DNS resolver can cache a DNS record before it expires and needs to be refreshed. If 0, the DNS cache will strictly respect the TTL from the DNS query response. |  |
| dns_azure_max_resolution_attempts |  | The maximum number of attempts made by the DNS resolver for Azure storages in a given DNS refresh cycle. |  |
| dns_azure_max_ttl_secs |  | The maximum DNS TTL for Azure storages. It is the longest period of time the DNS resolver can cache a DNS record before it expires and needs to be refreshed. If 0, the DNS cache will strictly respect the TTL from the DNS query response. |  |
| dns_gcs_max_resolution_attempts |  | The maximum number of attempts made by the DNS resolver for GCS storages in a given DNS refresh cycle. |  |
| dns_gcs_max_ttl_secs |  | The maximum DNS TTL for GCS storages. It is the longest period of time the DNS resolver can cache a DNS record before it expires and needs to be refreshed. If 0, the DNS cache will strictly respect the TTL from the DNS query response. |  |
| enable_manifest_cache | true | If true, query heads will cache manifests downloaded from the database bucket. |  |
| enable_query_auth | false | When enabled, requests to the query service, including URL paths starting with /query and TCP native queries, require authentication. For more information, see Query Authentication. |  |
| enable_traefik_hsts | false | If set to true, Traefik will enforce HSTS on all its connections. WARNING: This may lead to hard-to-diagnose persistent SSL failures if there are any errors in SSL configuration, and cannot be turned off later. |  |
| enable_vector | null | Run vector to send Kubernetes pod logs to JSON files in a bucket. Default inferred from the value of scale_off. |  |
| env | {} | Environment variables to set on all Kubernetes pods that are part of the Hydrolix cluster. Useful, for example, to specify an AWS key. |  |
| force_container_user_root | false | Set the initial user for all containers to 0 (root). |  |
| http_connect_timeout_ms | 300 | Maximum time to wait for socket to complete connection to cloud storage. |  |
| http_port | None | The port for non-HTTPS connections. This setting is rarely needed since hydrolix_url contains this information. |  |
| http_ssl_connect_timeout_ms | 1000 | Maximum time to wait for SSL/TLS handshake completion during connection to cloud storage. |  |
| http_response_timeout_ms | 1000 | Maximum time to wait for HTTP headers while reading from cloud storage. |  |
| http_read_timeout_ms | 1000 | Maximum time to wait, after socket read (recv system call), for cloud storage to make data available. |  |
| http_write_timeout_ms | 10000 | Maximum time to spend uploading a single partition to cloud storage. |  |
| https_port | None | The port for HTTPS connections. This setting is rarely needed since hydrolix_url contains this information. |  |
| hydrolix_name | "hdx" | The name you would like to assign your Hydrolix cluster. |  |
| hydrolix_url | null | The URL you would like to use to access your Hydrolix cluster. | "https://my-host.hydrolix.live", "https://my-host.mydomain.com", "http://my-host.local" |
| intake_head_accept_data_timeout | 0s | Configures the maximum duration that an intake-head pod will wait for a request to be accepted into the partition creation pipeline. If the timeout is reached, the request will be rejected with a 429 status code response. If not configured or set to 0, intake-head pods will not time out. |  |
| intake_head_index_backlog_enabled | false | Enables the indexing backlog feature: rather than blocking intake requests until storage is ready, the intake-head will keep a backlog of requests when storage is overwhelmed. If enabled, the newest data received will be indexed ahead of older data when the backlog grows. See the other intake_head_index_backlog_* options. |  |
| intake_head_index_backlog_max_accept_batch_size | 50 | Controls the maximum number of buckets accepted from ingestion and added to the backlog at a time. Only applicable if intake_head_index_backlog_enabled is true. |  |
| intake_head_index_backlog_max_mb | 256 | Controls the maximum size in MB that the indexing backlog on intake-head is allowed to grow before either dropping data or slowing new entries. Only applicable if intake_head_index_backlog_enabled is true. |  |
| intake_head_index_backlog_purge_concurrency | 1 | Controls the number of workers used to purge buckets from the intake-head backlog when the max size is breached. Only applicable if intake_head_index_backlog_enabled is true. |  |
| intake_head_max_outstanding_requests | 0 | Configures the maximum number of requests that an intake-head pod will allow to be outstanding and in process before rejecting new requests with a 429 status code response. If not configured or set to 0, intake-head pods will never reject new requests. |  |
| ip_allowlist | ["127.0.0.1/32"] | A list of CIDR ranges that should be allowed to connect to the Hydrolix cluster load balancer. For more information, see Enabling Access & TLS. |  |
| job_purge_age | 2160h | The age of a job after which it should be deleted from PostgreSQL Batch / Alter, expressed as a duration string. The default of 2160h is 90 days. For more information, see Batch Ingest. |  |
| job_purge_enabled | true | Whether or not the Job Purge CronJob should run. |  |
| job_purge_schedule | 0 2 * * * | CRON schedule for Job Purge CronJob. |  |
| kafka_careful_mode | false | Whether or not to rate-limit Kafka reads to 10 messages per read call and limit concurrency to 1 reader. By default, Hydrolix reads 50 messages per call across 32 concurrent readers. |  |
| kafka_tls_ca | "" | A CA certificate used by the kafka_peer to authenticate with a Kafka server. For more information, see Ingest via Kafka. |  |
| kafka_tls_cert | "" | The PEM format certificate the kafka_peer uses to authenticate with a Kafka server. For more information, see Ingest via Kafka. |  |
| kafka_tls_key | null | The PEM format key the kafka_peer uses to authenticate with a Kafka server. For more information, see Ingest via Kafka. |  |
| kinesis_coordinate_period | 10s | For Kinesis sources, how often the coordination process runs, which distributes shard consumption among available peers. For more information, see Ingest via Kinesis. |  |
| kubernetes_premium_storage_class | null | The storage class to use with persistent volumes created in Kubernetes for parts of a Hydrolix cluster where throughput is most critical. If set to default-kubernetes-storage-class, it will use your cloud provider's default storage class. |  |
| kubernetes_profile | generic | Use default settings appropriate to this type of Kubernetes deployment. | "gke", "eks" |
| kubernetes_storage_class | null | The storage class to use with persistent volumes created in Kubernetes as part of a Hydrolix cluster. If set to default-kubernetes-storage-class, it will use your cloud provider's default storage class. |  |
| kubernetes_version | "1.22" | Make manifests compatible with this version of Kubernetes. |  |
| limit_cpu | true | If set, container CPU limits are set to match CPU requests in Kubernetes. |  |
| logs_http_remote_table | "hydro.logs" | An existing Hydrolix <project.table> where the log data should be stored within a remote cluster's object store. | "my_project.my_table" |
| logs_http_remote_transform | "megaTransform" | The transform schema to use for log data ingested into the remote cluster's object store. | "another_transform_name" |
| logs_http_table | "hydro.logs" | An existing Hydrolix <project.table> where the log data should be stored in the local cluster's object store. | "my_project.my_table" |
| logs_kafka_bootstrap_servers | ["redpanda"] | A comma-separated list of Kafka bootstrap servers to send logs to. |  |
| logs_kafka_topic | "logs" | Kafka topic that Hydrolix sends logs to. |  |
| logs_sink_local_url | "http://stream-head:8089/ingest/event" | The full URI to send local HTTP requests containing log data. |  |
| logs_sink_remote_url | "" | The full URI to send remote HTTP requests containing log data. | "https://{remote_hdx_host}/ingest/event", "https://{remote_hdx_host}/pool/stream-head", "https://{remote_hdx_host}/pool/{custom-ingest-pool}/ingest/event" |
| logs_http_transform | "megaTransform" | The transform schema to use for log data ingested into the local Hydrolix cluster's object store. | "another_transform_name" |
| logs_sink_remote_auth_enabled | false | When enabled, remote HTTP will use basic authentication from a curated secret. Note that enabling this option requires providing basic auth via the environment variables LOGS_HTTP_AUTH_USERNAME and LOGS_HTTP_AUTH_PASSWORD. | true, false |
| logs_sink_type | "kafka" | This determines the sink to send log data to on its way to an object store. The value "kafka" actually places the log data on the cluster's Redpanda queue, not a Kafka queue. | "kafka", "http" |
| log_level |  | Allows modification of logging levels for individual Hydrolix services. See Log Level for more. |  |
| log_vacuum_concurrency | 8 | Number of concurrent log deletion processes. |  |
| log_vacuum_dry_run | false | If true, Log Vacuum will only log its intentions and take no action. |  |
| log_vacuum_enabled | true | Whether or not the Log Vacuum CronJob should run. |  |
| log_vacuum_max_age | 2160h | Maximum age of a log file before it is removed from cloud storage, expressed as a duration string. |  |
| log_vacuum_schedule | 0 4 * * * | CRON schedule for Log Vacuum CronJob. |  |
| max_http_retries | 3 | The number of retries Hydrolix attempts for a failed HTTP request. These retries happen in addition to the original request, so the total maximum number of requests for value N is always N+1. |  |
| merge_cleanup_batch_size | 5000 | Number of entries to fetch for each request to the catalog. |  |
| merge_cleanup_delay | 15m | How long before a merged partition should be deleted, expressed as a duration string. |  |
| merge_cleanup_enabled | true | Whether or not the Merge Clean-up CronJob should run. |  |
| merge_cleanup_schedule | */5 * * * * | CRON schedule for Merge Clean-up CronJob. |  |
| merge_head_batch_size | 10000 | Number of records to pull from the catalog per request by the Merge head. |  |
| merge_interval | "15s" | The time the merge process waits between checks for mergeable partitions. |  |
| merge_max_partitions_per_candidate | 100 | The maximum number of partitions per Merge candidate. |  |
| merge_min_mb | 1024 | Size in megabytes of the smallest merge tier. All other merge tiers are multiples of this value. |  |
| monitor_ingest | false | Runs a pod which sends a heartbeat event through ingest to the hydro.monitor table once per second. Tests the entire intake->storage->query path when paired with an alert on that table. |  |
| monitor_ingest_timeout | 1 | HTTP timeout (in seconds) for POSTs from monitor_ingest. |  |
| native_port | 9000 | The port on which to serve the ClickHouse plaintext native protocol. |  |
| native_tls_port | 9440 | The port on which to serve the ClickHouse TLS native protocol. |  |
| oom_detection |  | Configuration options for detecting indexing OOM scenarios and retrying with smaller data sizes, if possible, for services that perform ingest. Outer keys are names of the ingest services. The supported services are intake-head, kafka-peer, kinesis-peer, and akamai-siem-peer. Available keys under each service are k8s_oom_kill_detection_enabled, k8s_oom_kill_detection_max_attempts, circuit_break_oom_detection_enabled, and preemptive_splitting_enabled. See this page for details. |  |
| otel_endpoint | null | Send OTLP data to the HTTP server at this URL. |  |
| overcommit | false | When true, turn off memory reservations and limits for Kubernetes pods. Useful when running on a single node Kubernetes cluster with constrained resources. |  |
| partition_cleaner_grace_period | 24h | Sets a minimum age for a partition before it's deactivated or deleted. Expressed as a duration string. |  |
| partition_vacuum_batch_size | 10000 | Number of entries to fetch from object storage or the catalog on each request. |  |
| partition_vacuum_concurrency | 5 | Number of concurrent vacuum operations to run. Each vacuum operation covers a single table at a time. |  |
| partition_vacuum_dry_run | true | If true, Partition Vacuum will only log its intentions and take no action. |  |
| partition_vacuum_enabled | true | Whether or not the Partition Vacuum CronJob should run. |  |
| partition_vacuum_grace_period | 24h | Minimum age of a partition before it is considered for deactivation or deletion, expressed as a duration string. |  |
| partition_vacuum_schedule | 0 1 * * * | CRON schedule for Partition Vacuum CronJob. |  |
| pg_ssl_mode | "disable" | Determines whether and with what priority an SSL connection will be negotiated when connecting to a Postgres server. See here for more details. | "disable", "require", "verify-ca", "verify-full" |
| prune_locks_enabled | true | Whether or not the Prune Locks CronJob should run. |  |
| prune_locks_grace_period | "24h" | Minimum age of a lock before it is considered for removal, expressed as a duration string. |  |
| prune_locks_schedule | 30 0 * * * | CRON schedule for Prune Locks CronJob. |  |
| prometheus_remote_write_url | None | URL of the external Prometheus server. |  |
| prometheus_remote_write_username | "hdx" | User name for external Prometheus server authentication. |  |
| prometheus_retention_time |  | When to remove old Prometheus data. | 15d |
| prometheus_retention_size |  | The maximum number of bytes of Prometheus data to retain. Units supported: B, KB, MB, GB, TB, PB, EB. |  |
| prometheus_scrape_interval | "15s" | Sets Prometheus' default scrape_interval value, which controls the overall frequency of metric scraping. |  |
| pools | null | A list of dictionaries describing pools to deploy as part of the Hydrolix cluster. See here for more details. |  |
| query_peer_liveness_check_path | (appropriate path) | The HTTP path used to configure a Kubernetes liveness check for query-peers. Set to 'none' to disable. |  |
| query_readiness_initial_delay | 0 | The time in seconds to wait before starting query readiness checks. |  |
| registry | "public.ecr.aws/l2i3s2a2" | A Docker registry to pull Hydrolix containers from. |  |
| rejects_vacuum_dry_run | false | If enabled, the Rejects Vacuum CronJob will not delete files, but instead log the files it would have deleted. |  |
| rejects_vacuum_enabled | true | Whether or not the Rejects Vacuum CronJob should run. |  |
| rejects_vacuum_max_age | 168h | How old a rejects file should be before it is deleted, expressed as a duration string (e.g. 1h5m4s). |  |
| rejects_vacuum_period_mins | 180 | How often (in minutes) to run the Rejects Vacuum job. |  |
| rejects_vacuum_schedule | 0 0 * * * | CRON schedule for Rejects Vacuum CronJob. |  |
| s3_endpoint | "" | The endpoint used to communicate with S3. |  |
| sample_data_url | "" | The storage bucket URL used to load sample data. |  |
| scale | null | A list of dictionaries describing overrides for scale-related configuration for Hydrolix services. For more information, see Scaling your Cluster. |  |
| scale_off | false | When true, override all deployment and statefulset replica counts with a value of 0 and disable vector. This scales your Hydrolix deployment to 0. For more information, see Scaling your Cluster. |  |
| scale_profile | "eval" | Selects from a set of predefined defaults for scale. |  |
| sdk_timeout_sec | 300 | How many seconds Merge should be given to run before it is killed. |  |
| silence_linode_alerts | false | Disables email notifications triggered by violations of Notification Thresholds for a Linode Compute Instance. For more information, see Linode's Resource Usage Email Alerts. |  |
| skip_init_turbine_api | false | Skips running database migrations in the init-turbine-api job. Set to true when running multiple clusters with a shared database. |  |
| stale_job_monitor_batch_size | 300 | How many jobs to probe in a single request. |  |
| stale_job_monitor_enabled | true | Whether or not the Stale Job Monitor CronJob should run. |  |
| stale_job_monitor_limit | 3000 | How many jobs in total the Stale Job Monitor will process per cycle. |  |
| str_dict_enabled | true | Enable/disable multi-threaded string dictionary decoding. |  |
| str_dict_nr_threads | 8 | Sets the maximum number of concurrent vCPUs used for decoding. |  |
| str_dict_min_dict_size | 32768 | Controls the number of entries in each string dictionary block. |  |
| stream_concurrency_limit | null | The number of concurrent stream requests per CPU, allocated across all pods, beyond which traefik will return 429 busy error responses. If not set or set to null, no limit is enforced. |  |
| stream_load_balancer_algorithm | round-robin | The load balancer algorithm to use with stream-head and intake-head services. | round-robin, least-connections-p2c |
| stream_partition_block | 6 | The number of partitions to use on a non-default Redpanda stream topic per TB/day of usage. |  |
| stream_partition_count | 50 | The number of partitions for the internal Redpanda topic used by the Stream Ingest service. |  |
| stream_replication_factor | 3 | The replication factor for the internal Redpanda topic used by the Stream Ingest service. |  |
| targeting | null | Specifies the target nodes where Hydrolix resources can run in the Kubernetes cluster. See here for more details. |  |
| task_monitor_enabled | true | Whether or not the Task Monitor CronJob should run. |  |
| task_monitor_heartbeat_timeout | 600 | How old a task's heartbeat should be (in seconds) before it is timed out. |  |
| task_monitor_schedule | */2 * * * * | CRON schedule for Task Monitor. |  |
| task_monitor_start_timeout | 21600 | How old a ready task should be (in seconds) before it is considered lost and timed out. |  |
| traefik_hsts_expire_time | 315360000 | Expiration time for HSTS caching in seconds. |  |
| traefik_keep_alive_max_time | 26 | The number of seconds a client HTTP connection can be reused before receiving a Connection: close response from the server. Zero means no limit. |  |
| traefik_service_annotations | null | Creates traefik annotations as a dictionary. Allows additional traefik service annotations. | annotations: external-dns.alpha.kubernetes.io/hostname: nginx.example.com |
| traefik_service_type | "public_lb" | Specifies the type of load balancer to use for the cluster. The default is a public load balancer. If private_lb is chosen and the cluster is running on EKS or GKE, an internal load balancer will be provisioned. Use node_port for a customer-provided ELB, or cluster_ip to skip setting up anything that exposes the service outside of the Kubernetes cluster itself. | "public_lb", "private_lb", "node_port", "cluster_ip" |
| turbine_api_init_pools | false | If enabled, the turbine-api component initializes some pools. |  |
| unified_auth | true | Enforce centralized API authentication for ingest endpoints and other endpoints served by traefik. Unified Authentication has been true by default since the 4.6 release. It is incompatible with basic_auth. |  |
| use_https_with_s3 | true | Whether or not to use HTTPS when communicating with S3. |  |
| use_hydrolix_dns_resolver | true | If true, use the Hydrolix DNS resolver. If false, use the system resolver. |  |
| use_crunchydata_postgres | false | Use a PostgreSQL instance managed by Crunchydata's postgres operator instead of the default dev mode PostgreSQL. |  |
| vector_bucket | null | Bucket where Vector should save JSON format pod logs. |  |
| vector_bucket_path | "logs" | Prefix under which Vector saves pod logs. |  |
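
Several of the settings above interact, so a combined fragment may help. The sketch below turns on intake back-pressure, keeps Partition Vacuum in dry-run mode, and sets the HTTP retry count. The values chosen for `intake_head_accept_data_timeout` and `intake_head_max_outstanding_requests` are illustrative, not recommendations, and writing `catalog_intake_connections` as a nested mapping is an assumption based on the default shown in the table.

```yaml
spec:
  # Reject intake requests with a 429 instead of waiting indefinitely.
  # Both values are illustrative; the defaults (0s and 0) disable these limits.
  intake_head_accept_data_timeout: 5s
  intake_head_max_outstanding_requests: 100

  # Keep a bounded indexing backlog when storage is overwhelmed.
  intake_head_index_backlog_enabled: true
  intake_head_index_backlog_max_mb: 256

  # Log what Partition Vacuum would do, once a day at 01:00, without acting.
  partition_vacuum_enabled: true
  partition_vacuum_dry_run: true
  partition_vacuum_schedule: "0 1 * * *"

  # 3 retries plus the original request gives at most 4 requests in total.
  max_http_retries: 3

  # Assumed mapping form of the connection pool options.
  catalog_intake_connections:
    max_lifetime: 10m
    max_idle_time: 1m
```

CRON expressions are quoted so that YAML reads them as plain strings; this matters most for schedules that begin with `*`, such as `*/5 * * * *`.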

Kubernetes Secrets

Some Hydrolix settings require sensitive data, including passwords or secrets:

| Variable | Default Value | Description |
| --- | --- | --- |
| ROOT_DB_PASSWORD | null | The admin password of the PostgreSQL server used as the Hydrolix Catalog. |
| AWS_SECRET_ACCESS_KEY | null | AWS service account secret key used to connect to AWS and external AWS services. |
| AZURE_ACCOUNT_KEY | null | Azure secret key used to connect to Azure blob storage. |
| TRAEFIK_PASSWORD | random | Default password when basic_auth is enabled. For more information, see IP Access and TLS. |
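
One way to supply these values is a standard Kubernetes Secret, sketched below. The secret name `curated` is an assumption taken from the "curated secret" mentioned under logs_sink_remote_auth_enabled, and the namespace and placeholder values are hypothetical; confirm the expected secret name for your Hydrolix version before applying anything like this.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: curated              # assumed secret name; verify for your deployment
  namespace: my-namespace    # hypothetical namespace
type: Opaque
stringData:
  # stringData accepts plain text; Kubernetes stores it base64-encoded.
  ROOT_DB_PASSWORD: "change-me"
  AWS_SECRET_ACCESS_KEY: "change-me"
```

Include only the variables your deployment needs; AZURE_ACCOUNT_KEY, for example, applies only when using Azure blob storage.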