Data Types

The following data types can be used in transforms. For integers, finding the smallest available width is not critical: the benefits of a narrower type are limited to storage size and compression.
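For reference, the range of each integer type listed in the table below follows directly from its bit width and signedness. The following is a minimal Python sketch of that arithmetic, not Hydrolix code; the helper names `int_range` and `types_that_fit` are hypothetical and exist only for illustration.

```python
# Bit width and signedness of the integer types named in the table.
INTEGER_TYPES = {
    "int8": (8, True),
    "int32": (32, True),
    "int64": (64, True),
    "uint8": (8, False),
    "uint32": (32, False),
    "uint64": (64, False),
}


def int_range(bits, signed):
    """Return the (min, max) values representable in the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def types_that_fit(lowest, highest):
    """List every integer type whose range covers [lowest, highest]."""
    fits = []
    for name, (bits, signed) in INTEGER_TYPES.items():
        type_min, type_max = int_range(bits, signed)
        if type_min <= lowest and highest <= type_max:
            fits.append(name)
    return fits
```

For example, `types_that_fit(-1, 300)` returns only the signed types of 32 bits or more. Any of them is a workable choice, since width mainly affects storage size and compression.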

A couple of data types, boolean and epoch, translate to more primitive types and are therefore just shorthand that makes a transform easier to write. Any column, regardless of type, can be null or missing a value, except for the primary datetime field.

We index all data types except doubles by default. Indexing stores all unique values in a lookup-style data structure to speed up data access. It is transparent to the user, and you can switch it off per column if needed.

| Type | Description | Default Indexed |
| --- | --- | --- |
| datetime | A string-based representation of a moment in time, e.g. Mon Jan 2 15:04:05 -0700 2006. | Yes |
| epoch | Converted to a datetime or datetime64. Using this mapping requires additional formatting information; see [Formatting timestamps](gu/docs/timestamps-1). | Yes |
| double | A 64-bit floating-point number. | No |
| int8 | A signed 8-bit integer (-128 to 127). | Yes |
| int32 | A signed 32-bit integer (-2147483648 to 2147483647). | Yes |
| int64 | A signed 64-bit integer (-9223372036854775808 to 9223372036854775807). | Yes |
| string | A variable-length string. Equivalent to VARCHAR or CLOB in other data systems. | Yes |
| uint8 | An unsigned 8-bit integer (0 to 255). | Yes |
| uint32 | An unsigned 32-bit integer (0 to 4294967295). | Yes |
| uint64 | An unsigned 64-bit integer (0 to 18446744073709551615). | Yes |
| boolean | Converted to a uint8 prior to storage. The case-insensitive strings "false" or "0" get converted to 0. Any other non-0 value gets converted to 1. | Yes |
| array | An array of any one of the primitive types that Hydrolix supports. | Yes, unless the elements are double |
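
The boolean conversion rule in the table can be sketched in a few lines of Python. This is an illustration of the rule as worded above, not Hydrolix's implementation; the function name `boolean_to_uint8` is hypothetical, and the sketch assumes the incoming value is already a non-null string.

```python
def boolean_to_uint8(value):
    """Map a string to 0 or 1 following the rule described in the table.

    Assumption: `value` is a non-null string. Null or missing values are
    a separate case and are not modeled here.
    """
    # "false" and "0" are matched case-insensitively and become 0.
    if value.lower() in ("false", "0"):
        return 0
    # Any other value becomes 1.
    return 1
```

Under this reading, `"FALSE"` and `"0"` both map to 0, while strings such as `"true"`, `"1"`, or `"yes"` all map to 1.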

Notes about datetime

You must pair every datetime column with time-format information, which is specified elsewhere, such as within the column definitions of an ingest schema.
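
To illustrate why a format has to accompany a datetime column, the sketch below parses the example string from the table using Python's standard library. The `strptime` directives are Python's own notation and serve only as a stand-in; they are not the format syntax Hydrolix expects. The names `RAW`, `LAYOUT`, and `parse_moment` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The example string from the datetime row of the table.
RAW = "Mon Jan 2 15:04:05 -0700 2006"

# A Python layout describing that string: weekday, month, day, time,
# UTC offset, year. Without it the string cannot be interpreted.
LAYOUT = "%a %b %d %H:%M:%S %z %Y"


def parse_moment(raw, layout):
    """Parse a datetime string using an explicit layout."""
    return datetime.strptime(raw, layout)


moment = parse_moment(RAW, LAYOUT)
# The same moment expressed in UTC.
moment_utc = moment.astimezone(timezone.utc)
```

Passing a layout that does not match the string, such as `"%Y-%m-%d"`, raises a `ValueError`, which is the Python analogue of a column whose format information does not match its data.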

If you set a datetime or epoch column to use millisecond resolution, then you may see Hydrolix-generated view schemas refer to this column as "datetime64". This special primitive applies to views only, and simply indicates a high-resolution datetime column.
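
The difference between second and millisecond resolution can be shown with a short Python sketch. It assumes an epoch value counts seconds or milliseconds since 1970-01-01 UTC; the unit labels `"s"` and `"ms"` and the function name `epoch_to_datetime` are hypothetical and are not Hydrolix configuration values.

```python
from datetime import datetime, timedelta, timezone

# Start of the Unix epoch in UTC.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_to_datetime(value, unit="s"):
    """Convert an integer epoch value to a timezone-aware datetime.

    Assumption: `unit` is "s" for seconds or "ms" for milliseconds.
    """
    if unit == "s":
        return UNIX_EPOCH + timedelta(seconds=value)
    if unit == "ms":
        return UNIX_EPOCH + timedelta(milliseconds=value)
    raise ValueError("unsupported unit: " + unit)


# Second resolution: no sub-second component survives.
coarse = epoch_to_datetime(1136239445)
# Millisecond resolution: the trailing 123 ms is preserved.
fine = epoch_to_datetime(1136239445123, "ms")
```

Both values refer to the same second, 2006-01-02 22:04:05 UTC, but only the millisecond input carries a fractional part.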

The following example defines a primary datetime column named "timestamp" and a column named "index_data" whose value is an array of unsigned 64-bit integers (uint64).

{
    "output_columns": [
        {
            "name": "timestamp",
            "datatype": {
                "type": "datetime",
                "primary": true,
                "format": "2006-01-02 15:04:05 MST"
            }
        },
        {
            "name": "index_data",
            "datatype": {
                "type": "array",
                "elements": [
                    {
                        "type": "uint64"
                    }
                ]
            }
        }
    ]
}
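
As a quick way to check a definition like the one above, the following Python sketch loads the fragment and summarizes each column's type. It is a reading aid only and uses no Hydrolix API; the names `TRANSFORM_FRAGMENT`, `describe`, and `summarize` are hypothetical, and the fragment is wrapped in an outer object so that it parses as standalone JSON.

```python
import json

# The example column definitions, wrapped in braces to form valid JSON.
TRANSFORM_FRAGMENT = """
{
    "output_columns": [
        {
            "name": "timestamp",
            "datatype": {
                "type": "datetime",
                "primary": true,
                "format": "2006-01-02 15:04:05 MST"
            }
        },
        {
            "name": "index_data",
            "datatype": {
                "type": "array",
                "elements": [{"type": "uint64"}]
            }
        }
    ]
}
"""


def describe(datatype):
    """Render a datatype as text, expanding array element types."""
    if datatype["type"] == "array":
        inner = ", ".join(describe(e) for e in datatype["elements"])
        return "array<" + inner + ">"
    return datatype["type"]


def summarize(fragment):
    """Map each column name to a readable description of its type."""
    columns = json.loads(fragment)["output_columns"]
    return {col["name"]: describe(col["datatype"]) for col in columns}
```

Calling `summarize(TRANSFORM_FRAGMENT)` maps `timestamp` to `datetime` and `index_data` to `array<uint64>`, which matches the description of the example.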
