Question

http://about.digg.com/blog/looking-future-cassandra

I found this article about Digg's move to Cassandra, but I didn't understand the author's idea of a bucket for each (user, item) pair. A few more details on the idea would help me understand the solution better.

Thanks

Solution

It sounds like they are using a super column family with one row per user and one super column per item; each subcolumn of an item's super column represents a friend who dugg the item. The super column for an item in a user's row would then be the bucket for that (user, item) pair. At least in pycassa, this makes an insert as simple as:

column_family.insert(user, {item: {friend: ''}})
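To make the shape of that bucket concrete, here is a minimal in-memory sketch. It uses plain nested dicts as a stand-in for the super column family and does not talk to Cassandra; the helper names (make_store, record_digg, friends_who_dugg) and the sample ids are hypothetical, not anything from Digg or pycassa.

```python
from collections import defaultdict

# Plain dicts standing in for a super column family (not a Cassandra client):
# row key (user) -> super column (item) -> subcolumn (friend) -> empty value.
def make_store():
    return defaultdict(lambda: defaultdict(dict))

def record_digg(store, user, item, friend):
    """Mirror of insert(user, {item: {friend: ''}}): add one friend to the bucket."""
    store[user][item][friend] = ''

def friends_who_dugg(store, user, item):
    """Read the (user, item) bucket: the friends of `user` who dugg `item`."""
    # .get() avoids creating empty rows for users or items we have never seen.
    return sorted(store.get(user, {}).get(item, {}))

store = make_store()
record_digg(store, 'alice', 'story42', 'bob')
record_digg(store, 'alice', 'story42', 'carol')
record_digg(store, 'alice', 'story7', 'bob')
print(friends_who_dugg(store, 'alice', 'story42'))  # ['bob', 'carol']
```

Reading a bucket touches only one user's row, which is presumably the point of writing a digg into every friend's row up front.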

They could also have done this a couple of other ways, and I'm not sure which they chose.

One is to use a standard column family, use a (user,item) combination for the row key, and use one column per friend who dugg the item:

column_family.insert(user + item, {friend: ''})

Another is to use a standard column family, use just (user) for the row key, and use an (item, friend) combination for the column name:

column_family.insert(user, {item + friend: ''})

It doesn't sound like this is what they used, but it's an acceptable option as well.
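The two standard column family layouts can be compared with the same kind of in-memory sketch. Again this uses plain dicts rather than Cassandra, and the SEP delimiter, the helper names and the sample ids are assumptions added for illustration. A delimiter is worth having because bare string concatenation of two ids can be ambiguous.

```python
SEP = ':'  # hypothetical delimiter; assumes it never appears in user or item ids

def row_key(user, item):
    return user + SEP + item

def column_name(item, friend):
    return item + SEP + friend

# Without a delimiter, different pairs can collide on the same key.
assert 'ab' + 'c' == 'a' + 'bc'

# Layout A: row key = (user, item), one column per friend.
layout_a = {}
layout_a.setdefault(row_key('alice', 'story42'), {})['bob'] = ''
layout_a.setdefault(row_key('alice', 'story42'), {})['carol'] = ''

# Layout B: row key = user, column name = (item, friend).
layout_b = {}
layout_b.setdefault('alice', {})[column_name('story42', 'bob')] = ''
layout_b.setdefault('alice', {})[column_name('story42', 'carol')] = ''
layout_b.setdefault('alice', {})[column_name('story7', 'bob')] = ''

# A: the bucket is a whole row.
bucket_a = sorted(layout_a[row_key('alice', 'story42')])

# B: the bucket is the set of columns in the user's row sharing the item prefix.
prefix = 'story42' + SEP
bucket_b = sorted(
    name[len(prefix):] for name in layout_b['alice'] if name.startswith(prefix)
)

print(bucket_a, bucket_b)  # ['bob', 'carol'] ['bob', 'carol']
```

In the first layout the bucket is an entire row, so reading it is a single row lookup; in the second, all of a user's buckets share one row and a bucket is recovered by selecting the columns whose names start with the item.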

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow