Working with Fastly Logging

Hydrolix provides native integration for Fastly log storage. Full integration intructions are available here. Once integrated, you can query and analyse the logs in real-time using Hydrolix. In fact, we’ve done just that with this web site as it is fronted by Fastly. The data is stored in the fastly.logs table.

Checking date time ranges & columns

We switched on log storage mid October 2020, but a quick check can clarify date time ranges and that the data is flowing.


SELECT min(timestamp), max(timestamp), count() 
 FROM fastly.logs
 WHERE (timestamp BETWEEN '2020-01-01 00:00:00' AND now())

That show min(datetime) as 2020-10-11 01:03:59 and max(datetime) as now. We can also Describe the table to check what the column names are.


DESCRIBE fastly.logs
cache_status String					
geo_city String					
geo_country_code String					
host String					
is_tls	UInt64					
request	String					
request_accept_charset	String					
request_accept_language	String					
request_referer	String					
request_user_agent String					
time_elapsed UInt64					
timestamp DateTime					
url String		

Find the top N requested urls per week

The url column represents page requests which can contain query string parameters and also includes references to static assets.

We can clean up the urls at query time, using cutQueryString(url) to strip any query string params and also filter out the static assets using match(string, pattern) to exclude unwanted results. Finally we can group byStartOfWeek(datetime) and topK(N)(column) function to calculate top values. A sub-query will do the filtering, cleaning and time grouping.


SELECT week, topK(10)(path) 
 FROM (
   SELECT toStartOfWeek(timestamp) AS week, cutQueryString(url) AS path 
   FROM fastly.logs  
   WHERE (timestamp BETWEEN '2020-10-11 00:00:00' AND now()) 
   AND  match(url,'^.*\.(css$|json$|js$|svg$|png$|xml|jpg$)') != 1 
   GROUP BY week, path) 
GROUP BY week 

Find the % doc request daily

Our website hosts both our documentation and live query API that executes these SQL examples. The API routes based on a different sub domain (try.hydrolix.io) to our Hydrolix servers.

Previously we found our topN docs, but how much percentage of the traffic is that?

The host column represents which domain the traffic is targetting. It’s a simple task at query time to compare a foreground data set to the full data set. We will use the countIf(column, condition) which is a combinator aggregation function on count(). The condition will be based on startsWith(string, condition). This allows us to conditionally count a foreground subset of the data. The time will be grouped on toStartOfDay(datetime). As this returns a time component by default, we format the output using formatDateTime(datetime, format) to remove the time.


SELECT 
  formatDateTime(toStartOfDay(timestamp), '%D') as day,     
  countIf(startsWith(host,'docs')) / count() * 100 as %percent_docs 
  FROM fastly.logs  
  WHERE (timestamp BETWEEN '2020-10-11 00:00:00' AND now())
  GROUP BY day
  ORDER BY day 

Ok. Get clicking on these live SQL examples and influence the results!

Find the impact of repsonse time v cache status

The time_elapsed column provides an indicator to the responsiveness of our website. In theory, this should be impacted by the cache_status. Lets check.

The quantiles(l1, l2 ..)(column) function can quickly grab multiple percentile quantiles on a single scan. We will check quantiles(0.5, 0.75, 0.99)(time_elasped) and group daily by cache_status.


SELECT 
  formatDateTime(toStartOfDay(timestamp), '%D') AS day, 
  cache_status, quantiles(0.5, 0.75, 0.99)(time_elapsed) 
  FROM fastly.logs  
  WHERE (timestamp BETWEEN '2020-10-11 00:00:00' AND now()) 
  GROUP BY day, cache_status 
  ORDER BY day, cache_status

As expected, cache MISS have a big impact on response times. Although the worse response times are on ERROR.

We recommend monitoring ERROR, possibly modifying the top N query with an addition cache_status filter for example. After doing so, we discover it was the query API endpoint. Removing the cutQueryString function revealed it was due to malformed queries string params.

Yay, for data driven analysis!