Timeseries Functions

Timeseries Functions⚓︎

SQLCURL

SELECT count()
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-02 12:00:00')

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT count()
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-02 12:00:00')
EOF

Date Time can be specified as absolute value '2016-01-01 00:00:00', relative value now(), yesterday() or calculated using Date Time arithmetic add(), subtract() and interval functions

group time⚓︎

Grouping and subdivision of time ranges is achieved through the use of time functions (reference). These functions can be used in either SELECT expression or GROUP BY clause position.

There are general time functions for toStartOfYear(), toStartOfMonth(), toStartOfDay(), toStartOfMinute(), toStartOfSecond(). Along with common patterns toStartofFiveMinute(), Ten and Fifteen and many more. Each function accepts a Date Time data type and rounds values to their respective time unit.

SQLCURL

SELECT toStartOfFiveMinute(timestamp) AS five, max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 04:00:00')
  GROUP BY five
  ORDER BY five

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT toStartOfFiveMinute(timestamp) AS five, max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 04:00:00')
  GROUP BY five
  ORDER BY five
EOF

This will aggregate the data into 5 minutes buckets and compute max values of the specified metric. If the time buckets aren't required in the result, we could move the function into the GROUP BY clause.

SQLCURL

SELECT max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 04:00:00')
  GROUP BY toStartOfMinute(timestamp) AS five
  ORDER BY five

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 04:00:00')
  GROUP BY toStartOfMinute(timestamp) AS five
  ORDER BY five
EOF

Fine grained time grouping⚓︎

Fine grained grouping can be achieved using toStartOfInterval(t, INTERVAL unit) where unit can be year|month|day|minute

SQLCURL

SELECT toStartOfInterval(timestamp, INTERVAL 3 hour) AS three, max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-02 00:00:00')
  GROUP BY three
  ORDER bY three

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT toStartOfInterval(timestamp, INTERVAL 3 hour) AS three, max(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-02 00:00:00')
  GROUP BY three
  ORDER bY three
EOF

use time functions together⚓︎

Time functions can be used together for more complex calculations.

SQLCURL

SELECT toStartOfHour(timestamp) AS hour, count()
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 02:00:00')
  GROUP BY hour, toStartOfInterval(timestamp, INTERVAL 30 minute) AS mins
  ORDER BY hour, mins

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT toStartOfHour(timestamp) AS hour, count()
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 02:00:00')
  GROUP BY hour, toStartOfInterval(timestamp, INTERVAL 30 minute) AS mins
  ORDER BY hour, mins
EOF

Equally sized time slots⚓︎

timeSlots(t, duration, size) is useful for defining equally sizes slots. Where duration is specified in seconds as UInt32 (type conversion) and size controls the number of slots contained in duration.

SQLCURL

SELECT timeSlots(timestamp, toUInt32(60*10), 6*10) AS slots, max(usage_guest)
  FROM sample.metrics
  WhERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 01:00:00')
  GROUP BY slots
  ORDER BY slots

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT timeSlots(timestamp, toUInt32(60*10), 6*10) AS slots, max(usage_guest)
  FROM sample.metrics
  WhERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 01:00:00')
  GROUP BY slots
  ORDER BY slots
EOF

This produces 600 second slots, each containing an array of time in 60 second intervals.

Time aggregations⚓︎

Finding the latest value of a metric is a common monitoring requirement. argMax(metric,t) succinctly expresses finding the latest value over a specified time range.

SQLCURL

SELECT argMax(usage_guest, timestamp)
  FROM sample.metrics
  WHERE (timestamp between '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT argMax(usage_guest, timestamp)
  FROM sample.metrics
  WHERE (timestamp between '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')
EOF

Rate of change over time⚓︎

Calculating the rate of change over time is an important concept in time series analytics. boundingRatio(t,metric) will calculate rate change based on metric argMax - argMin / max - min time. Adding a grouping time function will give you a breakdown per time bucket.

SQLCURL

SELECT team, boundingRatio(timestamp, usage_iowait) AS rate
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 01:00:00')
  GROUP BY team
  ORDER BY rate DESC

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT team, boundingRatio(timestamp, usage_iowait) AS rate
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 01:00:00')
  GROUP BY team
  ORDER BY rate DESC
EOF

Range of quantiles⚓︎

Calculating a range of quantiles for a single metric in a single pass is also easy usingquantiles(L1, L2, ..)(metric). Where levels are specified as a number between 0 and 1.

SQLCURL

SELECT quantiles(0.25, 0.5, 0.75, 0.99)(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
SELECT quantiles(0.25, 0.5, 0.75, 0.99)(usage_guest)
  FROM sample.metrics
  WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')
EOF

Aggregations on groups of data⚓︎

<aggregation>Resample(start, end, step)(<aggFunction_params>, resampling_key) lets you divide data into groups, and then separately aggregates the data in those groups. Groups are created by splitting the values from one column into intervals.

SQLCURL

SELECT countResample(80,90,2)(usage_guest, usage_guest)
   FROM sample.metrics
   WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')

curl --data-binary @- https://try3.hydrolix.live/query <<EOF
 SELECT countResample(80,90,2)(usage_guest, usage_guest)
   FROM sample.metrics
   WHERE (timestamp BETWEEN '2016-01-01 00:00:00' AND '2016-01-01 12:00:00')
EOF

This groups the metric values [80, 82, 84, 86, 88, 90] and provides counts for each bucket. Very useful!