Purge old data strategy for Cassandra DB

#1
We store events in multiple tables depending on category.
Each event has an id but contains multiple subelements.
We have a lookup table to find events using the subelement_id.
Each subelement can participate in at most 7 events, so a partition will hold at most 7 rows.
We will have 30-50 billion rows in eventlookup over a period of 5 years.

CREATE TABLE eventlookup (
    subelement_id text,
    recordtime timeuuid,
    event_id text,
    PRIMARY KEY ((subelement_id), recordtime)
)


Problem: How do we delete old data once we reach the 5 (or some other number) year mark?
We want to purge the "tail" at specific intervals, say every week or month.


Approaches investigated so far:

- TTL of X years (performs well, but the TTL needs to be known beforehand and costs 8 extra bytes per column; see the sketch after this list)
- No delete - simply ignore the problem (somebody else's problem :0)
- Rate-limited single-row deletes (requires a complete table scan and potentially billions of delete statements)
- Split the data into multiple tables -> "CREATE TABLE eventlookupYYYY". Once a year is no longer needed, simply drop that table. (Problem: every read would potentially have to query all the tables)
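
For reference, the per-write TTL variant from the first bullet looks like this (a minimal sketch; the id values in the INSERT are made up):

INSERT INTO eventlookup (subelement_id, recordtime, event_id)
VALUES ('sub-123', now(), 'evt-456')
USING TTL 157680000; // 5 years in seconds, fixed at write time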

Are there any other approaches we can consider?

Is there a design decision we can make now (we are not in production yet) that will mitigate the future problem?

#2
If it's worth the extra space, track your `subelement_id`s for ranges of `recordtime` in a separate table / column family.

Then you can easily get the ids to delete for records of a specific age if you do not want to set a TTL a priori.

But keep in mind that this tracking has to distribute well: just a single `date` as the partition key will generate hotspots in your cluster and very wide rows, so think about a partition key like `(date, chunk)`, where in the past I have used a random number from 0-10 for `chunk`. You might also look at TimeWindowCompactionStrategy (see the sketch at the end of this post).

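A minimal sketch of such a tracking table (the table name and the 0-10 chunk range are illustrative, not from this thread):

CREATE TABLE eventlookup_by_day (
    day date,
    chunk int, // random 0-10, spreads each day over 11 partitions
    subelement_id text,
    PRIMARY KEY ((day, chunk), subelement_id)
);

A weekly or monthly purge job can then read all 11 chunks for each expired day and issue deletes against eventlookup for the subelement_ids it finds.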


Your partition key is only set to `subelement_id`, so all tuples of 7 events for all recordtimes will be in *one* partition.
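
As for the TimeWindowCompactionStrategy idea, a hedged sketch of what enabling it could look like (the window size is illustrative; TWCS pays off mostly when rows expire via TTL, so whole SSTables can be dropped):

ALTER TABLE eventlookup
WITH compaction = {
    // bucket SSTables into ~monthly windows so expired data can drop as whole files
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '30'
};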

#3
Given your table structure, you need to know the `subelement_id` of your data just to fetch a single row. With this in mind, your table structure can be improved a bit by sorting your data by `recordtime DESC`:

CREATE TABLE eventlookup (
    subelement_id text,
    recordtime timeuuid,
    eventtype int,
    parentid text,
    partition bigint,
    event_id text,
    PRIMARY KEY ((subelement_id), recordtime)
)
WITH CLUSTERING ORDER BY (recordtime DESC);

Now all of your data is in descending order and this will give you a big advantage.

Suppose that you have multiple years of data (e.g. from 2000 to 2018). Assuming you need to keep only the last 5 years, you'd fetch data with something like:

SELECT * FROM eventlookup WHERE subelement_id = 'mysub_id' AND recordtime >= minTimeuuid('2013-01-01');

This query is efficient because C* will retrieve your data and stop scanning the partition exactly where you want it to: 5 years ago. The big plus is that if you have tombstones after that point, they won't impact your reads at all. That means you can "safely" trim everything before that point by issuing a delete:

DELETE FROM eventlookup WHERE subelement_id = 'mysub_id' AND recordtime < minTimeuuid('2013-01-01');

Beware that this delete will create tombstones. They will be skipped by your reads, BUT they will still be read during compactions, so keep that in mind.

Alternatively, you can simply skip the delete part if you don't need to reclaim the storage space; your system will keep running smoothly because you will always retrieve your data efficiently.



#4
AND default_time_to_live = 157680000 // 5 years in seconds
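
In context this is a table option, so against the OP's schema it would read roughly like this (a sketch):

CREATE TABLE eventlookup (
    subelement_id text,
    recordtime timeuuid,
    event_id text,
    PRIMARY KEY ((subelement_id), recordtime)
)
WITH default_time_to_live = 157680000; // every row expires 5 years after it is written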


