Fastly

Ingest Fastly CDN logs into Hydrolix

Introduction

Fastly is a real-time content delivery network (CDN) that accelerates serving web content and services to end users. Fastly caches and delivers content directly to users rather than fetching it from the origin for every visitor. By caching, optimizing, and securely delivering traffic, Fastly reduces data transfer costs from cloud storage services, making applications more efficient and scalable.

Fastly’s Real‑Time Log Streaming capability allows direct delivery of log data to Hydrolix for storage and analysis. Specifically, Fastly can send structured log events such as request timestamps, client IPs, request paths, and other fields to a Hydrolix cluster's HTTP ingest endpoint.

Fastly's documentation provides similar instructions at Log streaming: Hydrolix.

Before you begin

Before ingesting Fastly log data into Hydrolix, gather the following information:

| Item | Description | Example value | How to obtain this information |
|---|---|---|---|
| Cluster ingest URL | The hostname and ingest endpoint of your Hydrolix cluster. | ${HDX_HOSTNAME}/ingest/event | The value of hydrolix_url in your hydrolixcluster.yaml, including the preceding https://. |
| Table name | The destination table in the Hydrolix cluster for Fastly CDN logs, specified in the format project_name.table_name. | fastly_project.fastly_cdn_logs | The Fastly configuration specifies the destination table as a query parameter. See create a table if you need to create a Hydrolix table in which to store your Fastly CDN logs. |
| Basic authentication credentials | The user and password used to authenticate with the Hydrolix cluster and grant streaming ingest access to the destination table. | Username: hydrolix, Password: $TRAEFIK_PASSWORD | See Authentication for Fastly Real-Time Log Streaming for help setting up authentication for Fastly Real-Time Log Streaming. For a more thorough explanation of basic authentication as it pertains to a Hydrolix cluster, see Enable Basic Authentication. |

Getting started

There are two steps covered within this document. Following these steps will result in an integrated setup between Fastly and a running Hydrolix cluster:

  1. Create a Hydrolix transform
  2. Configure Fastly real-time log streaming

Make sure to verify after step 1 that the Hydrolix transform exists. You can do so using either method:

  • UI: Navigate to Data in the left-hand nav, select the table under the Tables tab, and locate the transform name under Table transforms.
  • API: Verify the transform you created is returned as part of the response from the Get transforms endpoint.
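The API check can be scripted. The following is a minimal sketch, not a definitive implementation: the Config API path, the bearer-token header, and the names ORG_ID, HDX_TOKEN, and fastly_transform are assumptions that do not come from this page, so confirm them against the Get transforms endpoint reference for your cluster. The has_transform helper is a loose text match on the response, not a JSON parser.

```shell
#!/usr/bin/env bash

# Fetch the transforms for one table.
# Assumption: the Config API path below matches your cluster's API version.
list_transforms() {
  local host="$1" org_id="$2" project_id="$3" table_id="$4" token="$5"
  curl --silent --show-error --fail \
    --header "Authorization: Bearer ${token}" \
    "https://${host}/config/v1/orgs/${org_id}/projects/${project_id}/tables/${table_id}/transforms/"
}

# Succeed when the JSON on stdin contains a "name" key with the given value.
has_transform() {
  local name="$1"
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"${name}\""
}

# Hypothetical usage (all variable values are placeholders):
# list_transforms "$HDX_HOSTNAME" "$ORG_ID" "$PROJECT_ID" "$TABLE_ID" "$HDX_TOKEN" \
#   | has_transform "fastly_transform" && echo "transform found"
```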

If you configure and activate Fastly real-time log streaming prior to creating the destination project, table, and transform in the Hydrolix cluster, the cluster will drop the incoming log data.

Create a Hydrolix transform

See Publishing Your Transform to create and publish a transform using the Fastly transform schema. The transform determines how your Fastly log data will be mapped onto your Hydrolix table.

You have two options for publishing the transform:

  1. UI: Register the transform through the UI using the following:
     • Project name
     • Table name
     • The contents of the output_columns property
  2. API: Alternatively, use the API, which requires the following:
     • Project ID
     • Table ID
     • The entirety of the Fastly transform JSON

If the transform you have created is the only one for your table, it will become the default transform for the table. If you have multiple transforms configured for the destination table, and the transform registered for Fastly CDN data isn't the default transform, you should specify the transform for this data as a query parameter in the endpoint URL.
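As a sketch of how the endpoint URL might be assembled, the helper below appends the destination table, and optionally the transform, as query parameters. The parameter names table and transform, the host hdx.example.com, and the transform name fastly_transform are assumptions for illustration; check them against your cluster's streaming ingest documentation.

```shell
#!/usr/bin/env bash

# Build the URL for the Fastly logging endpoint.
# Assumption: the ingest endpoint reads "table" and "transform" query parameters.
build_ingest_url() {
  local host="$1" table="$2" transform="${3:-}"
  local url="https://${host}/ingest/event?table=${table}"
  # Append the transform only when one is given, for example when the
  # Fastly transform is not the table's default transform.
  if [ -n "$transform" ]; then
    url="${url}&transform=${transform}"
  fi
  printf '%s\n' "$url"
}

build_ingest_url "hdx.example.com" "fastly_project.fastly_cdn_logs" "fastly_transform"
```

Omitting the third argument leaves the transform parameter off the URL, which relies on the table's default transform.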

Configure Fastly real-time log streaming

The following steps will set up Hydrolix as a logging endpoint in the Fastly console.

  1. Log in to the Fastly control panel.
  2. From the Home page, select CDN > CDN Services > the service you want to export logs from.
  3. Select the Service configuration tab.
  4. Navigate to Logging > HTTPS > Create endpoint.
  5. Enter the following configuration:

⚠️

Adding or removing fields from the Fastly log format

The Hydrolix transform should contain a column for every field defined in the Fastly log format. If you adjust the Fastly log format to add or remove a field, make sure to add or remove the same column from the Hydrolix transform and vice versa.

To aggregate any fields sent from Fastly that aren't in the Hydrolix transform into a single column, use the catch_all keyword. See use the catch-all feature for more information on the catch_all keyword in Hydrolix transforms. If Fastly log streaming omits a column defined in the transform, Hydrolix stores null values for that column.

  • URL: Enter the value for cluster ingest endpoint from Before you begin.
  • Maximum logs: Leave at 0, the default value.
  • Maximum bytes: Leave at 0, the default value.

Expand Advanced options and enter the following:

  • Content type: application/json
  • Custom header name: Authorization
  • Custom header value: The word Basic, a space, and the base64-encoded string hydrolix:TRAEFIK_PASSWORD. Here hydrolix is the username for the cluster's Traefik service endpoint, and TRAEFIK_PASSWORD is an environment variable in the general secret in the Kubernetes namespace for the Hydrolix cluster. See Authentication for Fastly Real-Time Log Streaming for how to encode this value.
  • Method: POST
  • JSON log entry format: Newline delimited
  • Select a log line format: Leave at Blank, the default value.
  • Compression: Set to None.

Leave the fields under Using your own certificate authority (CA)? blank unless you have manually configured TLS for the Hydrolix server or your origin-side TLS certificate is signed by a lesser-known CA. See Enable TLS for more information on Hydrolix TLS configuration.

  6. Click Create to create the new logging endpoint.
  7. Click Activate to deploy the configuration changes.

Verification

See Fastly transaction logs for some example queries that interact with the log data using the Hydrolix query interface. The query interface can be found at https://${HDX_HOSTNAME}/queries/.
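Before waiting for live Fastly traffic, it can help to send a request by hand that mirrors the Fastly endpoint settings: POST, Content-Type application/json, an Authorization header, and newline-delimited JSON. The sketch below assumes curl is installed. The function name send_test_events, the table query parameter, and the field names in the usage comment are hypothetical placeholders, not the Fastly transform schema.

```shell
#!/usr/bin/env bash

# POST newline-delimited JSON to the ingest endpoint using the same method,
# content type, and custom header as the Fastly endpoint configuration.
send_test_events() {
  local url="$1" auth_header_value="$2" payload="$3"
  curl --silent --show-error --fail \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: ${auth_header_value}" \
    --data-binary "$payload" \
    "$url"
}

# Hypothetical usage; replace the placeholder fields with columns from your transform:
# send_test_events \
#   "https://${HDX_HOSTNAME}/ingest/event?table=fastly_project.fastly_cdn_logs" \
#   "Basic <base64 of hydrolix:password>" \
#   $'{"timestamp":"2024-01-01T00:00:00Z"}\n{"timestamp":"2024-01-01T00:00:01Z"}'
```

With --fail, curl exits with a non-zero status when the server responds with an HTTP error, which makes a rejected request easy to spot in a script.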

Go further

For a step-by-step tutorial that goes into more detail on Fastly CDN logs and working with this data in Hydrolix, see Work with Fastly CDN Logs.

To view your Fastly log data in Grafana, see Visualizing Fastly Data with Grafana.

Authentication for Fastly Real-Time Log Streaming

The Fastly log streaming configuration uses basic authentication (a username and password) supplied as an HTTP header to the Hydrolix HTTP Stream ingest endpoint to authenticate to the Hydrolix cluster and allow streaming access to the destination table. While the HTTP ingest endpoint can also accept a bearer token for authentication, the Fastly log streaming service limits HTTPS endpoint configuration to one header with a 1024-character limit for the custom header value. Hydrolix user auth tokens are 1482 characters in length, so they exceed this limit.
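The length constraint can be checked with a few lines of shell. This sketch compares a candidate header value against the 1024-character limit described above. The function name fits_fastly_header is hypothetical, and token_like is a stand-in string of 1482 characters, not a real token.

```shell
#!/usr/bin/env bash

# Character limit for the Fastly custom header value, as stated above.
FASTLY_HEADER_VALUE_LIMIT=1024

# Succeed when the given header value fits within the limit.
fits_fastly_header() {
  local value="$1"
  [ "${#value}" -le "$FASTLY_HEADER_VALUE_LIMIT" ]
}

basic_value="Basic aHlkcm9saXg6UHBfMTN3M2dINVVRU1FnYXJiYWdlV2tn"
# Stand-in for a user auth token: 1482 repeated characters, not a real token.
token_like="$(printf '%1482s' '' | tr ' ' 'x')"

for candidate in "$basic_value" "Bearer ${token_like}"; do
  if fits_fastly_header "$candidate"; then
    echo "${#candidate} characters: fits"
  else
    echo "${#candidate} characters: too long"
  fi
done
```

The basic authentication value is well under the limit, while a bearer value built from a 1482-character token is not.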

The basic auth header consists of:

  • Key (Custom header name in the Fastly interface): Authorization
  • Value (Custom header value in the Fastly interface): Basic user:pass where the user is hydrolix, the password is the Kubernetes secret TRAEFIK_PASSWORD, the delimiter is :, and the user:pass contents are base64 encoded.

For example, if the value of the Kubernetes secret TRAEFIK_PASSWORD for the Hydrolix cluster is Pp_13w3gH5UQSQgarbageWkg, you can obtain the correct value by concatenating the username (hydrolix) with the password (Pp_13w3gH5UQSQgarbageWkg) using a colon (:) delimiter, then base64 encoding the result. Run the following command in your shell. The second line shows its output:

echo -n "hydrolix:Pp_13w3gH5UQSQgarbageWkg" | base64
aHlkcm9saXg6UHBfMTN3M2dINVVRU1FnYXJiYWdlV2tn

The output completes the Custom header value, which would be:

Basic aHlkcm9saXg6UHBfMTN3M2dINVVRU1FnYXJiYWdlV2tn
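The same steps can be wrapped in a small helper. This is a sketch under two assumptions: a POSIX-style shell and a base64 utility on the PATH. The function name make_basic_auth_header is hypothetical. It uses printf rather than echo -n, because echo -n is not portable across shells, and it strips the line breaks that some base64 implementations insert into long output.

```shell
#!/usr/bin/env bash

# Print the full Custom header value for a username and password.
make_basic_auth_header() {
  local user="$1" password="$2"
  # printf writes user:password with no trailing newline;
  # tr removes any line wrapping added by base64.
  printf 'Basic %s\n' "$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')"
}

make_basic_auth_header "hydrolix" "Pp_13w3gH5UQSQgarbageWkg"
```

For the example password, this prints the same Custom header value shown above.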