Cassandra schema for timelines

Question

You can use TimeUUIDs as column keys. They give you unique keys even if multiple application servers write data at the same time (it is very unlikely that two application servers would insert something at the exact same microsecond for the same user, but it is possible), and they sort in chronological order just like regular timestamps.
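As a rough illustration of why TimeUUIDs stay unique across writers, here is a minimal Python sketch using the standard library's version 1 UUIDs. The node ids and helper names are made up for the example, and the tie-break on raw bytes is only an assumption for the demo, not a description of Cassandra's comparator.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUIDs count 100-nanosecond intervals since 1582-10-15.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def new_column_key(node_id: int) -> uuid.UUID:
    """Return a version 1 (time-based) UUID tagged with a 48-bit node id."""
    return uuid.uuid1(node=node_id)


def column_key_time(key: uuid.UUID) -> datetime:
    """Recover the wall-clock time embedded in a version 1 UUID."""
    return GREGORIAN_EPOCH + timedelta(microseconds=key.time // 10)


# Two hypothetical application servers writing for the same user.
server_a, server_b = 0x0000000000AA, 0x0000000000BB
keys = [
    new_column_key(server_a),
    new_column_key(server_b),
    new_column_key(server_a),
]

# Order by the embedded timestamp; ties are broken on the raw bytes here,
# which is an assumption made for this demo only.
ordered = sorted(keys, key=lambda k: (k.time, k.bytes))

for key in ordered:
    print(key, column_key_time(key).isoformat())
```

Even if two servers produced the same timestamp, the node id (and clock sequence) would still make the keys differ, which is the property the answer relies on.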

You might also want to use a reverse comparator if you expect to display the most recent items more often than older ones (for example, if you want to show the ten most recent timeline items for a user). Using a reverse comparator means that Cassandra will store the columns within a row in reverse order, with the most recent items first. The most recent items are then at the start of the row, where Cassandra can find them most easily, so reading them performs very well.
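To make the effect of a reverse comparator concrete, here is a toy Python model of a single row that keeps its columns in descending timestamp order. It is only a sketch of the ordering idea; the `TimelineRow` class is hypothetical and says nothing about how Cassandra actually stores data on disk.

```python
import bisect


class TimelineRow:
    """Toy model of one wide row with columns in descending timestamp order."""

    def __init__(self):
        self._neg_ts = []  # negated timestamps, ascending => newest first
        self._items = []   # column values, parallel to _neg_ts

    def insert(self, ts: int, value: str) -> None:
        # Find the slot that keeps the row sorted newest-first.
        pos = bisect.bisect_left(self._neg_ts, -ts)
        self._neg_ts.insert(pos, -ts)
        self._items.insert(pos, value)

    def latest(self, n: int) -> list:
        # The newest items are simply the first n columns of the row.
        return self._items[:n]


row = TimelineRow()
for ts in (1000, 1003, 1001, 1002):  # writes may arrive out of order
    row.insert(ts, f"item@{ts}")

top_two = row.latest(2)
print(top_two)  # ['item@1003', 'item@1002']
```

With this ordering, "the ten most recent items" is a read of the first ten columns, with no need to skip past older data.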

Another thing to think about is how wide your rows will get. If you don't expect a timeline to be longer than a million or so items (exactly how many depends on how much data there is in each item), then a single row per user will probably work (but again, try using a reverse comparator, otherwise reading the most recent items will be slow). If you expect your users to generate millions and millions of timeline items, you need to think of a way to split a user's timeline into many rows, perhaps one row per user per month, or per day. The scheme needs to be deterministic so that you don't have to do a query to find which row you should read. Since your columns are sorted on time, using time to partition into multiple rows is natural.
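One way to get a deterministic row per user per month is to derive the row key from the user id and the item's timestamp. The sketch below assumes a hypothetical `user:YYYY-MM` key format; the function names are illustrative, and a per-day bucket would work the same way.

```python
from datetime import datetime, timezone


def row_key(user_id: str, when: datetime) -> str:
    """Row key for the month bucket that contains `when`."""
    return f"{user_id}:{when.year:04d}-{when.month:02d}"


def row_keys_newest_first(user_id: str, newest: datetime, oldest: datetime) -> list:
    """List the month buckets to read, walking back from `newest` to `oldest`."""
    keys = []
    year, month = newest.year, newest.month
    while (year, month) >= (oldest.year, oldest.month):
        keys.append(f"{user_id}:{year:04d}-{month:02d}")
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return keys


write_key = row_key("alice", datetime(2012, 3, 14, tzinfo=timezone.utc))
read_keys = row_keys_newest_first(
    "alice",
    newest=datetime(2012, 2, 1, tzinfo=timezone.utc),
    oldest=datetime(2011, 11, 20, tzinfo=timezone.utc),
)
print(write_key)  # alice:2012-03
print(read_keys)  # ['alice:2012-02', 'alice:2012-01', 'alice:2011-12', 'alice:2011-11']
```

Because the key is computed from data the application already has, a reader can go straight to the newest bucket and only move on to older ones if it needs more items.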