Viewing system logs
The components of your Hydrolix cluster make their logs available through several AWS services, as well as through direct access.
Hydrolix components based on EC2 instances, such as SQL query handlers, regularly copy their system and application logs into their associated S3 bucket. You may alternatively view these logs directly by logging into these components via SSH.
AWS Lambda-based components, such as the temporary processes that manage batch ingestion, write their logs to CloudWatch instead.
Hydrolix’s catalog-data component, as a special case, publishes its logs to AWS RDS.
Logs in S3
You can find your cluster’s system and other plain-text logs at the path /CLIENT_ID/logs/ within the S3 bucket that the cluster uses. For example, if your Hydrolix Client ID is “hdxcli-example123”, then you would find all the system logs from all the clusters associated with that ID at /hdxcli-example123/logs/ within the S3 bucket you use with Hydrolix.
The log directory contains a number of sub-directories named after dates, in YYYY-MM-DD format. Each of these sub-directories contains all the log files that Hydrolix wrote during that day, in UTC.
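As a minimal sketch of the layout described above, the following Python builds the key prefix for one day's log directory. The function names are hypothetical, and the sketch assumes that S3 object keys omit the leading slash shown in the path.

```python
from datetime import date, datetime, timezone


def log_prefix_for_day(client_id: str, day: date) -> str:
    """Return the assumed S3 key prefix for one UTC day's log directory."""
    # date.isoformat() yields YYYY-MM-DD, matching the sub-directory names.
    return f"{client_id}/logs/{day.isoformat()}/"


def todays_log_prefix(client_id: str) -> str:
    """Return the prefix for the current day, evaluated in UTC."""
    today_utc = datetime.now(timezone.utc).date()
    return log_prefix_for_day(client_id, today_utc)


print(log_prefix_for_day("hdxcli-example123", date(2021, 4, 20)))
```

A prefix like this could be passed as the `Prefix` argument of an S3 listing call, such as boto3's `list_objects_v2`, to enumerate that day's files.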
Browsing S3 log files
Each Hydrolix log file contains exactly one minute of system log entries from your cluster’s components. Hydrolix names these files according to a pattern that lets you filter for logs written during a particular hour or minute, or from a certain component:
LOG_TYPE-INSTANCE_ID-YYYY-MM-DD-HHMM.log.gz
- LOG_TYPE is the type of information being logged, with journald for components’ operating system logs, and other values for logs written by the components’ Hydrolix software. See S3 log filename prefixes for a full list.
- INSTANCE_ID is the AWS instance ID of the component that generated this log.
- YYYY-MM-DD-HHMM identifies the specific minute that the log file covers, in UTC.
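The filtering that this naming scheme allows can be sketched with shell-style wildcards. This example assumes the file names take the form LOG_TYPE-INSTANCE_ID-YYYY-MM-DD-HHMM.log.gz and that instance IDs begin with "i-". The function name `filter_logs` and the sample names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def filter_logs(names: Iterable[str], day: str,
                hour: Optional[int] = None, log_type: str = "*") -> List[str]:
    """Keep the names written on a given UTC day, optionally narrowed
    to one hour and one log type."""
    # "????" matches any HHMM; "21??" matches every minute of hour 21.
    hhmm = "????" if hour is None else f"{hour:02d}??"
    # Anchoring on "-i-" keeps "peer" from also matching "batch-peer".
    pattern = f"{log_type}-i-*-{day}-{hhmm}.log.gz"
    return [name for name in names if fnmatchcase(name, pattern)]


sample = [
    "head-i-0df413e635f4fc35b-2021-04-20-2131.log.gz",
    "peer-i-0aaa-2021-04-20-2131.log.gz",
    "batch-peer-i-0bbb-2021-04-20-2200.log.gz",
]
print(filter_logs(sample, "2021-04-20", hour=21))
```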
So, for example, the file named head-i-0df413e635f4fc35b-2021-04-20-2131.log.gz would contain the Hydrolix-specific logs for a query-head component from 9:31 PM UTC on April 20, 2021. (We can also see that the query head’s AWS instance ID was “i-0df413e635f4fc35b”, but this information is rarely relevant while browsing logs.)
As the .gz extension implies, Hydrolix compresses all these files in Gzip format.
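To make the example concrete, here is a hedged Python sketch that splits such a file name into its parts and reads a downloaded log file. The regular expression is inferred from the example file name above rather than taken from a Hydrolix specification, and `parse_log_name` and `read_log_lines` are hypothetical helper names.

```python
import gzip
import re
from datetime import datetime, timezone
from typing import List, NamedTuple

# Assumed layout: LOG_TYPE-INSTANCE_ID-YYYY-MM-DD-HHMM.log.gz
LOG_NAME = re.compile(
    r"^(?P<log_type>[a-z-]+?)-(?P<instance_id>i-[0-9a-f]+)"
    r"-(?P<stamp>\d{4}-\d{2}-\d{2}-\d{4})\.log\.gz$"
)


class LogFileInfo(NamedTuple):
    log_type: str
    instance_id: str
    minute_utc: datetime


def parse_log_name(name: str) -> LogFileInfo:
    """Split a log file name into type, instance ID, and UTC minute."""
    match = LOG_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized log file name: {name!r}")
    minute = datetime.strptime(match["stamp"], "%Y-%m-%d-%H%M")
    return LogFileInfo(match["log_type"], match["instance_id"],
                       minute.replace(tzinfo=timezone.utc))


def read_log_lines(path: str) -> List[str]:
    """Decompress a local .log.gz file and return its lines as text."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle]


info = parse_log_name("head-i-0df413e635f4fc35b-2021-04-20-2131.log.gz")
print(info.log_type, info.instance_id, info.minute_utc.isoformat())
```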
S3 log filename prefixes
Hydrolix’s S3 log files have prefixes that identify their purpose or component.
For a more complete exploration of these various components, see Hydrolix Components.
|File prefix||Log type||Component type|
|batch-peer||Application||A batch-ingestion worker.|
|head||Application||A query head.|
|journald||System||Any component type.|
|kafka-peer||Application||A Kafka ingestion worker.|
|merge-peer||Application||A merge service worker.|
|peer||Application||A query worker.|
|stream-head||Application||A stream-intake head.|
|stream-peer||Application||A stream-intake worker.|
Note that every EC2-based component creates journald logs, even if it also produces component-specific logs. Components that create only journald system logs include the Bastion, Zookeeper, and UI components.
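The prefix table above can also be expressed as a lookup, which may help when sorting a directory listing by component. This is only an illustrative sketch: `describe_log_file` is a hypothetical name, and the match assumes that each prefix is followed by an instance ID beginning with "i-".

```python
from typing import Optional, Tuple

# File prefix -> (log type, component type), copied from the table above.
S3_LOG_PREFIXES = {
    "batch-peer": ("Application", "A batch-ingestion worker."),
    "head": ("Application", "A query head."),
    "journald": ("System", "Any component type."),
    "kafka-peer": ("Application", "A Kafka ingestion worker."),
    "merge-peer": ("Application", "A merge service worker."),
    "peer": ("Application", "A query worker."),
    "stream-head": ("Application", "A stream-intake head."),
    "stream-peer": ("Application", "A stream-intake worker."),
}


def describe_log_file(name: str) -> Optional[Tuple[str, str]]:
    """Return (log type, component type) for a log file name, or None."""
    for prefix, description in S3_LOG_PREFIXES.items():
        # Requiring "-i-" after the prefix keeps "peer" and "head" from
        # being confused with the longer prefixes that end in them.
        if name.startswith(prefix + "-i-"):
            return description
    return None


print(describe_log_file("head-i-0df413e635f4fc35b-2021-04-20-2131.log.gz"))
```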
Logs via SSH
As an alternative to browsing logs copied regularly onto S3, you can log into components via SSH in order to browse log files directly. Doing this requires that you configure your deployment for SSH access, as detailed on the page Accessing components with SSH.
Every component writes its system logs to the path /var/log/. You can find service-specific logs in the following paths, on their respective components.
Logs in CloudWatch
Hydrolix components and services that run as temporary AWS Lambda instances write their logs as data into AWS CloudWatch. While you can use this data to power sophisticated monitoring dashboards within the CloudWatch web UI, you can also simply browse these logs as plain text, organized by Hydrolix service and sorted by date. This guide will focus on this simpler use-case.
To see your cluster’s CloudWatch logs, select Logs > Log groups from CloudWatch’s left navigation bar. The resulting list contains one entry for every CloudWatch-using service across all your clusters. Hydrolix names each of these log groups according to a pattern that identifies the service writing it:
- CLUSTER_ID is the Hydrolix ID of the cluster this component belongs to.
- LOG_TYPE is a brief text tag identifying this service’s role within your cluster. See CloudWatch log group suffixes for a full list.
Click on any log group to see all the individual logs it contains, sorted by date and time.
CloudWatch log group suffixes
Hydrolix’s CloudWatch log groups have suffixes that identify each one’s associated Hydrolix service.
|Log group suffix||Service type|
|autoingest||An auto-ingestion service.|
|batch-ingest-api||A batch-ingestion API handler.|
|batch-ingest-head||A batch-ingestion process head.|
|decay||Part of the data-aging service.|
|merge-head||Part of the data-merging service.|
|reaper||Part of the data-aging service.|
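If you work with a list of log group names outside the web UI, the suffix table above can be applied programmatically. The sketch below assumes only that each group name ends with one of the listed suffixes. The function name and the sample group name are hypothetical, not real Hydrolix output.

```python
from typing import Optional

# Log group suffix -> service type, copied from the table above.
CLOUDWATCH_SUFFIXES = {
    "autoingest": "An auto-ingestion service.",
    "batch-ingest-api": "A batch-ingestion API handler.",
    "batch-ingest-head": "A batch-ingestion process head.",
    "decay": "Part of the data-aging service.",
    "merge-head": "Part of the data-merging service.",
    "reaper": "Part of the data-aging service.",
}


def service_for_log_group(group_name: str) -> Optional[str]:
    """Return the service type for a log group name, or None if unknown."""
    for suffix, service in CLOUDWATCH_SUFFIXES.items():
        if group_name.endswith(suffix):
            return service
    return None


# Hypothetical group name, used only for illustration.
print(service_for_log_group("example-cluster-reaper"))
```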
Logs in RDS
The PostgreSQL database that powers your Hydrolix cluster’s metadata catalog stores its own logs on AWS RDS.
To view these logs, follow these steps.
- Select Databases from RDS’s left navigation bar.
- On the resulting screen, select the DB identifier that matches your Hydrolix Client ID.
- Finally, select the Logs & events tab.
The Logs section of the page you arrive at lists your catalog component’s logs, sorted by time. You may select any of these logs and click the View button to see that log’s text.
Other logs and metrics
Hydrolix features a number of ways to view your cluster’s ongoing operations, including a built-in Prometheus metrics database.
For more help with viewing your system logs, please contact Hydrolix support.