Store Data Across Multiple Buckets

Mix storage provided by different cloud providers, storage with different configurations within the same provider, or both.

Hydrolix can store data across multiple storage types within a single cluster.

Use Cases

You might use multiple storage buckets in your Hydrolix cluster for several reasons. Some applications separate data by region. Others segment data by customer. In some cases, certain data requires additional layers of security.

Storage Mapping

You can use storage mapping to shard data stored in a single table across multiple storage buckets. Hydrolix stores data based on literal values within the mapping column. You can provide a list of values that maps to each storage bucket for the given column.

The following example shards data using the column named "US State". This table sorts data into buckets according to the following rules:

  • rows where the "US State" column contains values "New York" or "Colorado" map to "8bc2f07d-cdfc-storage-2"
  • rows where the "US State" column contains values "Oregon" or "New Hampshire" map to "8bc2f07d-cdfc-storage-3"
  • all other rows map to the default storage bucket, "8bc2f07d-cdfc-storage-1"

"my_table": {
  "name": "<table_name>",
  ...
  "storage_map": {
    "default_storage_id": "8bc2f07d-cdfc-storage-1",
    "column_name": "US State",
    "column_value_mapping": {
      "8bc2f07d-cdfc-storage-2": [ "New York", "Colorado" ],
      "8bc2f07d-cdfc-storage-3": [ "Oregon", "New Hampshire" ]
    }
  }
}
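
To make the routing rules concrete, the following Python sketch models how a row's value in the mapping column selects a storage bucket. It is an illustration only, not Hydrolix's implementation: the function name resolve_storage_id is hypothetical, and the sketch assumes that matching is an exact, case-sensitive comparison against the listed literal values.

```python
from typing import Any, Mapping

# Storage map taken from the example table settings above.
STORAGE_MAP = {
    "default_storage_id": "8bc2f07d-cdfc-storage-1",
    "column_name": "US State",
    "column_value_mapping": {
        "8bc2f07d-cdfc-storage-2": ["New York", "Colorado"],
        "8bc2f07d-cdfc-storage-3": ["Oregon", "New Hampshire"],
    },
}


def resolve_storage_id(row: Mapping[str, Any], storage_map: Mapping[str, Any]) -> str:
    """Hypothetical helper: return the storage ID that a row would map to."""
    value = row.get(storage_map["column_name"])
    for storage_id, values in storage_map["column_value_mapping"].items():
        # Assumption: values are compared as exact, case-sensitive literals.
        if value in values:
            return storage_id
    # Rows with no matching value (or no mapping column) use the default bucket.
    return storage_map["default_storage_id"]


if __name__ == "__main__":
    for state in ("Colorado", "Oregon", "Texas"):
        print(state, "->", resolve_storage_id({"US State": state}, STORAGE_MAP))
```

In this sketch, a row with "Texas" in the "US State" column falls through to the default storage bucket because no list contains that value.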

You can also configure storage mappings in the Hydrolix Portal UI:

  1. Log in to the UI, hosted at https://<hostname>.
  2. Navigate to the Data view by clicking Data in the left sidebar.
  3. Click on the name of the project that contains the table that you want to configure.
  4. Click on the name of the table that you want to configure.
  5. Under Advanced Options, click the three-dot menu to the right of bucket settings.
  6. In the dropdown, click Edit.
  7. Select the column you would like to use to sort data into separate buckets.
  8. In the right sidebar, click Add mapping to configure a storage mapping.
  9. In the Storage ID dropdown, select the ID of the storage where you'd like to store a subset of data.
  10. In the Values text entry box, enter the values you would like to map to the storage bucket you just selected. Press space after entering a value to add it to the list of mapping values.
  11. Configure additional mappings by clicking Add mapping and repeating the previous two steps.
  12. Click Save changes to persist your storage mapping settings for the table.
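
If you prefer to prepare the same settings as JSON, the Python sketch below assembles a storage_map fragment from a default storage ID, a mapping column, and a set of per-bucket value lists. The helper build_storage_map is hypothetical and is not part of any Hydrolix tooling. The check that rejects a value assigned to two different buckets is an assumption about what a sensible mapping looks like, not documented Hydrolix behavior.

```python
import json
from typing import Dict, List


def build_storage_map(
    default_storage_id: str,
    column_name: str,
    mappings: Dict[str, List[str]],
) -> dict:
    """Hypothetical helper: assemble a storage_map fragment for a table."""
    seen: Dict[str, str] = {}
    for storage_id, values in mappings.items():
        for value in values:
            # Assumption: one value should not map to two different buckets.
            if value in seen and seen[value] != storage_id:
                raise ValueError(
                    f"{value!r} is mapped to both {seen[value]} and {storage_id}"
                )
            seen[value] = storage_id
    return {
        "default_storage_id": default_storage_id,
        "column_name": column_name,
        "column_value_mapping": {sid: list(vals) for sid, vals in mappings.items()},
    }


storage_map = build_storage_map(
    default_storage_id="8bc2f07d-cdfc-storage-1",
    column_name="US State",
    mappings={
        "8bc2f07d-cdfc-storage-2": ["New York", "Colorado"],
        "8bc2f07d-cdfc-storage-3": ["Oregon", "New Hampshire"],
    },
)

if __name__ == "__main__":
    # Print the fragment as it would appear in the table settings.
    print(json.dumps({"storage_map": storage_map}, indent=2))
```

Serializing with json.dumps also avoids hand-editing mistakes such as trailing commas, which JSON does not allow.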

Tradeoffs

Separating data into multiple storage buckets can impact system performance depending on your query patterns and resources.