Set Data Retention Policies
Hydrolix provides policies that automatically manage data retention based on age.
These policies control how long data remains available to query and when data should be permanently deleted.
Data lifecycle management operates as a two-step process involving two services.
- Decay identifies partitions containing data that has exceeded the configured retention period and marks them as "deactivated" in the Hydrolix catalog. Once deactivated, these partitions are hidden from new queries but remain available to complete any in-flight queries.
- Reaper permanently deletes deactivated partitions once a specified grace period has passed. It removes these partitions from both the catalog and object storage.
Policies for both services are configured at the table level. This enables you to set different retention policies for different types of data. For example, you might keep security logs for 90 days but retain transaction data for a year.
Table Settings
| Parameter | Default | Description |
|---|---|---|
| `age.max_age_days` | 0 (indefinite) | The total number of days data is active and queryable before deactivation |
| `reaper.max_age_days` | 1 | The total number of days data remains hidden after deactivation and before physical deletion |
Important Notes:
- `age.max_age_days` is always calculated from the primary datetime value of the table, not the ingest time
- `reaper.max_age_days` is always calculated relative to the deactivation date
- Some rows may take up to 24 hours longer to expire due to time-based partitioning
- Tables with `age.max_age_days = 0` are skipped entirely (no deactivation)
- Tables with `reaper.max_age_days = 0` are skipped for reaping (indefinite retention of inactive partitions)
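The rules above can be sketched as a small Python function. This is an illustrative model only, not Hydrolix code: the function name `partition_state`, its arguments, and the choice to measure age from the newest row in a partition are assumptions, and it treats deactivation as happening exactly when the retention period elapses rather than when the Decay service next runs.

```python
from datetime import datetime, timedelta


def partition_state(newest_row: datetime, now: datetime,
                    age_max_days: int, reaper_max_days: int) -> str:
    """Return 'active', 'deactivated', or 'deleted' for a hypothetical partition."""
    # age.max_age_days = 0: the table is skipped, so nothing is ever deactivated.
    if age_max_days == 0:
        return "active"

    # Age is measured from the table's primary datetime values, not ingest time.
    # Assumption: the partition expires once its newest row is past the limit.
    deactivated_at = newest_row + timedelta(days=age_max_days)
    if now < deactivated_at:
        return "active"

    # reaper.max_age_days = 0: deactivated partitions are kept indefinitely.
    if reaper_max_days == 0:
        return "deactivated"

    # The grace period is measured from the deactivation date.
    deleted_at = deactivated_at + timedelta(days=reaper_max_days)
    return "deactivated" if now < deleted_at else "deleted"
```

For instance, with `age_max_days=14` and `reaper_max_days=2`, a partition whose newest row is dated January 1 would be reported as active on January 14, deactivated on January 15, and deleted from January 17 onward.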
Example configuration
In this example, with `age.max_age_days` set to 14 and `reaper.max_age_days` set to 2:
- Data is available for 14 days from the partition's datetime values
- Partitions are deactivated on day 14. They are hidden from future queries but remain available for in-flight queries.
- Physical deletion occurs 2 days later
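The two values in this example (14 days active, then a 2-day grace period) could be written out as a settings payload along the following lines. This is a sketch under assumptions: the helper `nest_settings` is hypothetical, and both the nested layout and the top-level `settings` key are inferred from the dotted parameter names rather than confirmed by this page, so check the table API reference for the exact shape.

```python
import json


def nest_settings(flat: dict) -> dict:
    """Expand dotted keys such as 'age.max_age_days' into nested dicts."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Values taken from the example: 14 days queryable, 2 days until deletion.
example = nest_settings({
    "age.max_age_days": 14,
    "reaper.max_age_days": 2,
})

# Assumption: the table definition carries these under a "settings" key.
print(json.dumps({"settings": example}, indent=2))
```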