Pre-transforms for JSON subtype chaining
Pre-transforms are a transform configuration option that lets you chain JSON subtypes together before parsing. By listing multiple subtypes in your transform's format_details, you can process data from sources such as Amazon CloudWatch logs delivered through Amazon Data Firehose without using AWS Lambda functions for preprocessing. Each subtype in the list unwraps one layer of your data, in sequence.
This feature is available in Hydrolix v5.6 and later. UI support for pre-transforms is available in Hydrolix v5.8 and later.
Pre-transform subtypes
Pre-transforms support chaining the following JSON subtypes:
- firehose: Unwraps Amazon Data Firehose message format
- firehose/gzip: Unwraps gzip-compressed Amazon Data Firehose messages
- cloudwatch: Unwraps Amazon CloudWatch log format
- mPulse: Unwraps Akamai mPulse data format
Valid subtype combinations
Subtypes must be listed in the order the data needs to be unwrapped, outermost layer first. For example, if Firehose wraps CloudWatch data, use ["firehose", "cloudwatch"], not ["cloudwatch", "firehose"].
Pre-transform lists with more than one subtype must end with either cloudwatch or mPulse as the final subtype.
| Subtype Chain | Description |
|---|---|
| ["firehose"] | Firehose data only |
| ["firehose/gzip"] | Compressed Firehose data only |
| ["cloudwatch"] | CloudWatch logs only |
| ["mPulse"] | mPulse data only |
| ["firehose", "cloudwatch"] | CloudWatch logs through Firehose |
| ["firehose", "firehose/gzip"] | Compressed Firehose wrapped in Firehose |
| ["firehose", "mPulse"] | mPulse data through Firehose |
| ["firehose/gzip", "firehose"] | Firehose wrapped in compressed Firehose |
| ["firehose/gzip", "cloudwatch"] | Compressed CloudWatch logs through Firehose |
| ["firehose/gzip", "mPulse"] | Compressed mPulse data through Firehose |
| ["firehose", "firehose/gzip", "cloudwatch"] | CloudWatch through multiple Firehose layers |
| ["firehose", "firehose/gzip", "mPulse"] | mPulse through multiple Firehose layers |
| ["firehose/gzip", "firehose", "cloudwatch"] | CloudWatch through compressed and standard Firehose |
| ["firehose/gzip", "firehose", "mPulse"] | mPulse through compressed and standard Firehose |
| ["firehose/gzip", "cloudwatch", "mPulse"] | mPulse wrapping CloudWatch through compressed Firehose |
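Before submitting a transform, it can help to check a candidate chain against the table. The following is a minimal sketch, not part of any Hydrolix tooling: the set simply mirrors the rows listed above, and the function name is_listed_chain is hypothetical.

```python
# Chains copied row by row from the table above; nothing else is assumed valid.
VALID_CHAINS = {
    ("firehose",),
    ("firehose/gzip",),
    ("cloudwatch",),
    ("mPulse",),
    ("firehose", "cloudwatch"),
    ("firehose", "firehose/gzip"),
    ("firehose", "mPulse"),
    ("firehose/gzip", "firehose"),
    ("firehose/gzip", "cloudwatch"),
    ("firehose/gzip", "mPulse"),
    ("firehose", "firehose/gzip", "cloudwatch"),
    ("firehose", "firehose/gzip", "mPulse"),
    ("firehose/gzip", "firehose", "cloudwatch"),
    ("firehose/gzip", "firehose", "mPulse"),
    ("firehose/gzip", "cloudwatch", "mPulse"),
}


def is_listed_chain(pretransforms):
    """Return True if the chain matches a table row exactly, order included."""
    return tuple(pretransforms) in VALID_CHAINS
```

Because the comparison is order-sensitive, is_listed_chain(["firehose", "cloudwatch"]) passes while the reversed list does not, which matches the ordering rule described in this section.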
Configure pre-transforms
To create or update a transform with pre-transforms, include the pretransforms array in the format_details section of your transform definition.
The transform in the following example ingests compressed CloudWatch logs delivered through Firehose.
{
  "name": "cloudwatch_firehose_transform",
  "type": "json",
  "table": "your_table_name",
  "settings": {
    "output_columns": [
      {
        "name": "timestamp",
        "datatype": {
          "type": "datetime",
          "format": "2006-01-02T15:04:05.000Z",
          "primary": true
        }
      },
      {
        "name": "message",
        "datatype": {
          "type": "string"
        }
      }
    ],
    "format_details": {
      "pretransforms": ["firehose/gzip", "cloudwatch"]
    }
  }
}
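To illustrate why the chain in this example lists firehose/gzip before cloudwatch, the sketch below peels the same two layers from a synthetic payload. It is a conceptual illustration only and does not describe how Hydrolix implements pre-transforms. It assumes the Firehose envelope is an object with a records array of base64-encoded data fields, that each record is gzip-compressed, and that the CloudWatch document carries its entries in logEvents. All function names are hypothetical.

```python
import base64
import gzip
import json


def wrap_for_demo(log_events):
    """Build a synthetic payload: CloudWatch-style JSON, gzipped,
    base64-encoded, and placed inside a Firehose-style envelope."""
    cloudwatch_doc = {"messageType": "DATA_MESSAGE", "logEvents": log_events}
    compressed = gzip.compress(json.dumps(cloudwatch_doc).encode("utf-8"))
    return {"records": [{"data": base64.b64encode(compressed).decode("ascii")}]}


def unwrap_firehose_gzip(envelope):
    """Outer layer: decode and decompress every record in the envelope."""
    if "records" not in envelope:
        raise ValueError("object is missing expected 'records' key")
    return [
        json.loads(gzip.decompress(base64.b64decode(record["data"])))
        for record in envelope["records"]
    ]


def unwrap_cloudwatch(doc):
    """Inner layer: extract the individual log events from one document."""
    return doc["logEvents"]


def unwrap_all(envelope):
    """Apply the layers in chain order: firehose/gzip first, then cloudwatch."""
    events = []
    for doc in unwrap_firehose_gzip(envelope):
        events.extend(unwrap_cloudwatch(doc))
    return events
```

Passing an already-unwrapped CloudWatch document to unwrap_firehose_gzip raises an error about the missing records key, which is the kind of failure a mis-ordered chain produces.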
Troubleshooting
Data ingestion fails with "object is missing expected 'records' key"
Cause: Your subtypes are in the wrong order inside your pretransforms array, or you're using an invalid combination.
Solution:
- Verify that cloudwatch or mPulse is the last subtype in your chain.
- Ensure the order matches how your data was wrapped (the last applied subtype should be first in the array).
- Review the valid combinations above.
Cannot create transform with both "pretransforms" and "subtype"
Cause: The pretransforms and subtype fields are mutually exclusive.
Solution: Remove the subtype field from your transform definition. Use only pretransforms.
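If transform definitions are generated by a script, the conflict can be removed before the definition is submitted. The following is a minimal sketch that assumes subtype sits next to pretransforms inside settings.format_details; adjust the path if your definition stores it elsewhere. The helper name drop_subtype_if_pretransforms is hypothetical.

```python
import copy


def drop_subtype_if_pretransforms(transform):
    """Return a copy of the transform with subtype removed when
    pretransforms is present. The input dict is left unchanged."""
    cleaned = copy.deepcopy(transform)
    # Assumption: both keys live under settings.format_details.
    details = cleaned.get("settings", {}).get("format_details", {})
    if "pretransforms" in details:
        details.pop("subtype", None)
    return cleaned
```

Definitions that only use subtype pass through untouched, so the helper can be applied to every transform in a batch.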
Data from mPulse or CloudWatch cannot be further processed
Cause: Once data is unwrapped to mPulse or cloudwatch format, it cannot have additional subtypes applied.
Solution: This is expected behavior. Ensure these subtypes are always last in your pre-transform chain.