Cardinality

count

count returns the number of rows in a set for which the expression is non-null.

Parameters:

  • expression - the value to count for each row. It can be a column name or the result of another function.
  • DISTINCT - optional. Counts only the unique values rather than every value.

Usage:

  • count(DISTINCT expression)
  • count(expression)

Returns:

  • UInt64 count of rows
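
As a minimal sketch of the two usages above, the query below counts a nullable column with and without DISTINCT. The table visits and its columns are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table, created only for this example.
CREATE TABLE visits
(
    user_id  UInt32,
    referrer Nullable(String)
)
ENGINE = Memory;

INSERT INTO visits VALUES (1, 'search'), (2, NULL), (3, 'search'), (4, 'email');

SELECT
    count(referrer)          AS non_null_referrers,  -- 3: the row with NULL is not counted
    count(DISTINCT referrer) AS distinct_referrers   -- 2: 'search' and 'email'
FROM visits;
```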

uniq, uniqCombined, uniqCombined64

uniq, uniqCombined, and uniqCombined64 all return an approximate count of the unique values in a data set. The result is generally exact, but this is not guaranteed. If exact results are needed every time, use uniqExact.

Parameters:

  • data_set - one or more comma-delimited data sets.

Usage:

  • uniq(data_set)
  • uniq(data_set1, data_set2, ..., data_setN)
  • uniqCombined(data_set)
  • uniqCombined(data_set1, data_set2, ..., data_setN)
  • uniqCombined64(data_set)
  • uniqCombined64(data_set1, data_set2, ..., data_setN)

Returns:

  • UInt64 representing the number of unique values in the set. If more than one data set is passed, the function counts the distinct combinations of values across the data sets, so the result can be as large as the cross-product of the individual unique counts when the data sets vary independently. Passing two data sets that are exactly alike returns the same value as passing one, not a cross-product. Results are deterministic.
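
The multi-argument behavior can be sketched as follows, assuming a hypothetical page_views table that is not part of the original reference. The comments give the exact distinct counts for this data; on a table this small the approximate functions are expected to agree with them, but that is not a guarantee in general.

```sql
-- Hypothetical table, created only for this example.
CREATE TABLE page_views
(
    user_id UInt32,
    page    String
)
ENGINE = Memory;

INSERT INTO page_views VALUES (1, 'home'), (1, 'home'), (1, 'pricing'), (2, 'home');

SELECT
    uniq(user_id)          AS users,       -- exact distinct count: 2
    uniq(page)             AS pages,       -- exact distinct count: 2
    uniq(user_id, page)    AS user_pages,  -- exact distinct pairs: 3, fewer than 2 * 2
    uniq(user_id, user_id) AS same_twice   -- identical arguments: same count as uniq(user_id)
FROM page_views;
```

Here only three of the four possible (user_id, page) pairs occur in the data, so the combined count is below the product of the individual counts.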

uniqExact

uniqExact counts the exact number of unique values in a data set. This is the same functionality as count(DISTINCT data_set).

Parameters:

  • data_set - the data set to calculate the cardinality of.

Usage:

  • uniqExact(data_set)

Returns:

  • UInt64 representing the number of unique values in the set.
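
A short sketch comparing uniqExact with count(DISTINCT ...), using the built-in numbers table function so that no table is needed. It assumes the server's default configuration for how count(DISTINCT ...) is evaluated.

```sql
SELECT
    uniqExact(number % 10)      AS exact_unique,    -- 10: remainders 0 through 9
    count(DISTINCT number % 10) AS count_distinct   -- 10 as well
FROM numbers(1000);
```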
