Question

I want to query two column families at once. I'm using the cassandra-cql gem for Rails, and my column families are:

users
following
followers
user_count
message_count
messages

Now I want to get all messages from the people a user is following. Is there a kind of multiget in cassandra-cql, or is there another way to get this data by changing the data model?


Solution

I would call your current data model a traditional entity/relational design. This would make sense to use with an SQL database. When you have a relational database you rely on joins to build your views that span multiple entities.

Cassandra cannot perform joins. So instead of modeling your data around entities and relations, you should model it around the queries you intend to run. For your example of 'all messages from the people a user is following', you might have a column family where the row key is the user ID and the columns are all the messages from the people that user follows (the column name is a timestamp+userid and the value is the message):

RowKey                              Columns
-------------------------------------------------------------------
|        | TimeStamp0:UserA | TimeStamp1:UserB | TimeStamp2:UserA |
| UserID |------------------|------------------|------------------|
|        | Message          | Message          | Message          |
-------------------------------------------------------------------

You would probably also want a column family with all the messages a specific user has written (I'm assuming that the message is broadcast to all users instead of being addressed to one particular user):

RowKey                   Columns
--------------------------------------------------------
|        | TimeStamp0 | TimeStamp1 | TimeStamp2        |
| UserID |------------|------------|-------------------|
|        | Message    | Message    | Message           |
--------------------------------------------------------

Now when you create a new message you will need to insert it in multiple places: once in the author's own row and once in the row of every user who follows the author. But when you need to list all messages from the people a user is following, you only need to fetch from one row (which is fast).
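
As a rough illustration of that write-time fan-out, here is a plain Ruby sketch in which in-memory hashes stand in for the two column families. It is not cassandra-cql code and makes no Cassandra calls; the class and method names (TimelineStore, follow, post, timeline, messages_by) are made up for the example.

```ruby
# In-memory stand-in for the two column families described above.
# All names are hypothetical; nothing here talks to Cassandra.
class TimelineStore
  def initialize
    @followers     = Hash.new { |h, k| h[k] = [] } # author => users following the author
    @user_messages = Hash.new { |h, k| h[k] = {} } # author => { timestamp => message }
    @timelines     = Hash.new { |h, k| h[k] = {} } # user   => { [timestamp, author] => message }
  end

  def follow(user, author)
    @followers[author] << user unless @followers[author].include?(user)
  end

  # Write the message to the author's own row, then copy it
  # into the timeline row of every follower.
  def post(author, timestamp, message)
    @user_messages[author][timestamp] = message
    @followers[author].each do |user|
      @timelines[user][[timestamp, author]] = message
    end
  end

  # Reading a timeline touches a single row; the columns are
  # ordered by their [timestamp, author] name.
  def timeline(user)
    @timelines[user].sort.map { |(timestamp, author), message| [timestamp, author, message] }
  end

  # All messages written by one author, ordered by timestamp.
  def messages_by(author)
    @user_messages[author].sort
  end
end
```

The cost of a post grows with the number of followers, while reading a timeline stays a single-row fetch, which is the trade-off this data model makes.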

Obviously if you support updating or deleting messages you will need to do that everywhere that there is a copy of the message. You will also need to consider what should happen when a user follows or unfollows someone. There are multiple solutions to this problem and your solution will depend on how you want your application to behave.
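
To make one of those options concrete, here is a small Ruby sketch using plain hashes in place of the column families. It assumes one particular policy: deleting a message removes every copy, and unfollowing someone strips their messages from the timeline. Your application might instead keep messages that were already delivered. The method names and the hash layout are hypothetical.

```ruby
# Hypothetical in-memory layout:
#   user_messages: author => { timestamp => message }
#   timelines:     user   => { [timestamp, author] => message }
#   followers:     author => [users following the author]

# Delete a message everywhere a copy exists: the author's own row
# and the timeline row of each follower.
def delete_message(user_messages, timelines, followers, author, timestamp)
  user_messages.fetch(author, {}).delete(timestamp)
  followers.fetch(author, []).each do |user|
    timelines.fetch(user, {}).delete([timestamp, author])
  end
end

# One possible unfollow policy: remove the follower link and drop
# every timeline column written by the unfollowed author.
def unfollow(timelines, followers, user, author)
  followers.fetch(author, []).delete(user)
  timelines.fetch(user, {}).delete_if do |(_timestamp, column_author), _message|
    column_author == author
  end
end
```

Because the author's user ID is part of each column name, both operations can find the affected columns without consulting any other row.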

Licensed under: CC-BY-SA with attribution