Aging Data

Data Lifecycle Management

The Age service allows customers to set a Time to Live (TTL) on each table, beyond which older data will be automatically hidden and then deleted.

Age vs Reaper

Lifecycle management is designed as a two-step process. First, the background Age service looks for HDX partitions that exclusively contain data older than a specified number of days (age.max_age_days) and automatically marks them as "deactivated" in the Hydrolix catalog. These partitions are excluded from all future queries but remain available to queries that are already in flight at the time of deactivation. Note: age.max_age_days is always calculated relative to the primary datetime value of the table, not the time when the data was first received by Hydrolix. Additionally, because of the way time-based data is partitioned, some rows may take an additional 24 hours to expire.

Second, the background Reaper service looks for hidden/deactivated partitions and deletes them from object storage after an additional number of days (reaper.max_age_days). Note: in contrast to Age, reaper.max_age_days is always calculated relative to when the partition was marked for deletion (deactivated), not the primary timestamp.
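The two stages can be sketched in Python as a pair of eligibility checks. This is only an illustration of the rules described above, not the actual Age or Reaper implementation: the function names, the `newest_row_time` and `deactivated_at` parameters, and the exact boundary comparisons are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_deactivate(newest_row_time: datetime, now: datetime,
                      age_max_age_days: int) -> bool:
    """Stage 1 (Age): True when every row in the partition is past the TTL.

    newest_row_time is the largest primary datetime value in the partition,
    so if it is older than the cutoff, all rows in the partition are too.
    """
    if age_max_age_days == 0:  # 0 means "keep forever"
        return False
    return newest_row_time < now - timedelta(days=age_max_age_days)


def should_delete(deactivated_at: Optional[datetime], now: datetime,
                  reaper_max_age_days: int) -> bool:
    """Stage 2 (Reaper): True once the grace period since deactivation has passed.

    The clock starts at deactivation time, not at the primary timestamp.
    """
    if deactivated_at is None:  # partition is still active
        return False
    return now - deactivated_at >= timedelta(days=reaper_max_age_days)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
old_partition = datetime(2024, 1, 10, tzinfo=timezone.utc)
recent_partition = datetime(2024, 1, 20, tzinfo=timezone.utc)

print(should_deactivate(old_partition, now, 14))     # older than 14 days
print(should_deactivate(recent_partition, now, 14))  # still inside the TTL
print(should_delete(datetime(2024, 1, 28, tzinfo=timezone.utc), now, 2))
```

Keeping the two checks separate mirrors the point made above: the first depends only on the data's primary timestamp, the second only on when the partition was deactivated.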

In the example below, the data is viewable for 14 days and physically deleted after 16 days (T-(14+2)).

Example configuration settings:

    {
      "name": "my_name",
      "description": "my_description",
      "project": "project_uuid",
      "settings": {
        "age": {
          "max_age_days": 14
        },
        "reaper": {
          "max_age_days": 2
        }
      }
    }

age.max_age_days: The number of days, counted back from today, for which data is available to be viewed. For example, a value of 10 means that only data from the last 10 days (T-10) is available. Default: 0 (keep forever).

reaper.max_age_days: The number of additional days that deactivated data is kept before being deleted. For example, a value of 2 means that data is deleted 2 days after it passes the age.max_age_days threshold. Default: 1 (delete one day after deactivation).
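
As a rough illustration of how the two settings combine, the following Python sketch builds the settings fragment and adds the two values to estimate when data is physically deleted. The helper names `lifecycle_settings` and `days_until_deletion` are hypothetical and not part of any Hydrolix API, and the result is approximate, since partitioning can delay expiry by up to a further 24 hours.

```python
import json
from typing import Optional


def lifecycle_settings(age_days: int, reaper_days: int) -> dict:
    """Build the "age" and "reaper" portion of a table's settings."""
    if age_days < 0 or reaper_days < 0:
        raise ValueError("max_age_days values must not be negative")
    return {
        "age": {"max_age_days": age_days},
        "reaper": {"max_age_days": reaper_days},
    }


def days_until_deletion(settings: dict) -> Optional[int]:
    """Approximate days from the primary timestamp to physical deletion.

    Returns None when age.max_age_days is 0, because data is then kept
    forever and never reaches the Reaper stage.
    """
    age_days = settings["age"]["max_age_days"]
    if age_days == 0:
        return None
    return age_days + settings["reaper"]["max_age_days"]


settings = lifecycle_settings(14, 2)
print(json.dumps({"settings": settings}, indent=2))
print(days_until_deletion(settings))  # 14 visible days + 2 grace days
```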
