Prerequisites
To follow along with this tutorial, you will need the following:
- A running Hydrolix cluster that you have access to
- Python 3
- HDXCLI, a command-line tool for performing administrative tasks on Hydrolix
- A copy of the starter project from GitHub. This tutorial uses the nginx_web_access_logs folder extensively.
Hydrolix allows you to store high-volume time-series data at low cost for use cases like CDN logs, security logs, web server logs, and sensor logs. Some companies use Hydrolix to store 1 to 5 TB a day for a full year.
Hydrolix indexes the data in cloud storage, so once your installation is complete, the bulk of the project work typically consists of the following:
- Modeling how data is stored
- Ingesting data
- Querying data
- Adding advanced features such as dashboards and real-time aggregations
- Optimizing for high volume performance and lower costs
In this tutorial, you will learn the basics of modeling, ingesting, and querying data. You'll use an NGINX access log and try out the provided example by downloading the sample code and following along. You are encouraged to experiment and explore!
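Before modeling the data, it can help to see which fields a single access log entry contains. The following is a minimal sketch, not part of the starter project: it assumes the log uses NGINX's default "combined" format, and the `parse_line` helper and the sample line are hypothetical, made up for illustration. The sample data in the starter project may use a different layout.

```python
import re
from datetime import datetime

# Assumption: NGINX's default "combined" log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Hypothetical helper: split one log line into named, typed fields."""
    match = COMBINED.match(line)
    if match is None:
        return None  # the line does not follow the assumed format
    row = match.groupdict()
    # Time-series data needs a timestamp, so convert it to a datetime.
    row["timestamp"] = datetime.strptime(
        row.pop("time_local"), "%d/%b/%Y:%H:%M:%S %z"
    )
    row["status"] = int(row["status"])
    row["body_bytes_sent"] = int(row["body_bytes_sent"])
    return row


# Made-up sample line for illustration only.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Each named field in the result (timestamp, status code, bytes sent, and so on) is a candidate column to consider when you model how the data is stored.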