Prerequisites

To follow along with this tutorial, you will need the following:

A running Hydrolix cluster that you have access to
Python 3
HDXCLI, a command-line tool for performing administrative tasks on Hydrolix
A copy of the starter project from GitHub (this tutorial uses its nginx_web_access_logs folder extensively)
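
Before starting, it can help to confirm that the local tools are in place. The following is a minimal sketch, not part of the starter project: it assumes the HDXCLI executable is installed on your PATH under the name hdxcli (adjust cli_name if your installation differs), and it does not check cluster access.

```python
import shutil
import sys


def check_prerequisites(cli_name="hdxcli"):
    """Return a dict mapping each local prerequisite to True or False.

    cli_name is an assumption about the HDXCLI executable name.
    """
    return {
        # The tutorial asks for Python 3; any 3.x interpreter passes this check.
        "python3": sys.version_info.major == 3,
        # shutil.which returns None when the executable is not on PATH.
        cli_name: shutil.which(cli_name) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Anything reported as missing should be installed before continuing.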

Hydrolix lets you store high-volume time-series data at low cost for use cases such as CDN logs, security logs, web server logs, and sensor logs. Some companies use Hydrolix to store 1 to 5 TB a day and retain it for a full year.

Cloud storage is used to index the data, so once your installation is complete, most project work typically consists of the following:

Modeling how data is stored
Ingesting data
Querying data
Adding advanced features such as dashboards and real-time aggregations
Optimizing for high-volume performance and lower costs

In this tutorial, you will learn the basics of modeling, ingesting, and querying data. You'll work with an NGINX access log: download the sample code, then follow along with the provided example. You are encouraged to experiment and explore!
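
To get a feel for the data before modeling it, the sketch below parses one line in NGINX's default "combined" log format into named fields. It is an illustration only: the sample line is made up, and the log files in the starter project may use a different format, in which case the pattern would need adjusting.

```python
import re
from datetime import datetime

# NGINX's default "combined" format is:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    row = match.groupdict()
    # Convert the timestamp and numeric fields from strings to typed values.
    row["time_local"] = datetime.strptime(row["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    row["status"] = int(row["status"])
    row["body_bytes_sent"] = int(row["body_bytes_sent"])
    return row


# Hypothetical sample line, not taken from the starter project.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

The timestamp, status code, and byte count are the kinds of fields you will decide how to type and index when modeling the data.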

Updated 9 months ago