Question

I've defined a column family that uses composite values for the row keys; say the key type is CompositeType(LongType,LongType). Storing items with this key type works fine, and SELECT works as expected when I know the full key. But let's say I want all keys that have 0 as the first element and anything as the second. So far, the only way I can see to perform this query is as follows:

If I want all keys that match 0:*, I run a CQL query for key >= 0:0 AND key < 1:0, which works as long as the cluster uses an order-preserving partitioner.
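To make the range trick concrete, here is a minimal Python sketch of it. It models the keys as tuples held in sorted order, which is what an order-preserving partitioner provides. The helper names `prefix_bounds` and `scan_prefix` are made up for illustration, and the sketch assumes the second component is never negative (otherwise 0:0 would not be the smallest key with first component 0).

```python
from bisect import bisect_left


def prefix_bounds(first):
    """Bounds such that start <= key < end matches every key (first, *).

    Assumes second components are non-negative, so (first, 0) is the
    smallest possible key with this first component.
    """
    return (first, 0), (first + 1, 0)


def scan_prefix(sorted_keys, first):
    """Return all keys whose first component equals `first`.

    `sorted_keys` must already be in key order; that is the role the
    order-preserving partitioner plays in the real cluster.
    """
    start, end = prefix_bounds(first)
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


keys = sorted([(0, 7), (1, 0), (0, 0), (2, 5), (0, 42), (1, 3)])
matches = scan_prefix(keys, 0)
print(matches)  # [(0, 0), (0, 7), (0, 42)]
```

If the keys were stored in hash order instead, the two bounds would no longer enclose a contiguous run of keys, which is why the query depends on the partitioner.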

My questions are:

1) Is this odd syntax only because I'm using a CQL driver (the only option for Node.js aside from Thrift)?

2) Is there any inefficiency with this type of query? Essentially, I'm using a composite key instead of super columns, since those aren't supported in CQL. I have no problem dealing with this logic in the code as long as there are no limitations to using it like this.

Solution

I would suggest changing your data model. Use RandomPartitioner and make just the first component the row key. Push the second component into the column names; that is, make your column names composites instead.

Since column names are always sorted, you can do easy slicing operations. For example,

a) When you know both components, do a get slice on the row key (the first component), bounded by the first component of the composite column name (your original second component).

b) When you know just the first component, fetch the complete row for that row key.

This is the approach CQL3 takes when you ask it to create a table whose primary key has multiple columns.
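The suggested layout can be sketched as a toy in-memory model in Python. This is only an illustration, not how Cassandra is implemented: the class `WideRowStore`, its method names, and the trailing column-name components `"name"` and `"score"` are all hypothetical. The row key is the original first component, and each column name is a composite (tuple) whose leading element is the original second component.

```python
from bisect import bisect_left, insort


class WideRowStore:
    """Toy model of wide rows: each row keeps its column names sorted."""

    def __init__(self):
        self._names = {}   # row key -> sorted list of composite column names
        self._values = {}  # (row key, column name) -> value

    def put(self, row_key, column, value):
        # Insert the column name in sorted position the first time it is seen.
        if (row_key, column) not in self._values:
            insort(self._names.setdefault(row_key, []), column)
        self._values[(row_key, column)] = value

    def get_row(self, row_key):
        """Case (b): only the first component is known, so read the whole row."""
        names = self._names.get(row_key, [])
        return [(c, self._values[(row_key, c)]) for c in names]

    def get_slice(self, row_key, start, end):
        """Return columns with start <= name < end, in column-name order."""
        names = self._names.get(row_key, [])
        lo = bisect_left(names, start)
        hi = bisect_left(names, end)
        return [(c, self._values[(row_key, c)]) for c in names[lo:hi]]

    def get_by_leading_component(self, row_key, second):
        """Case (a): both components known, so slice on the composite's prefix."""
        # (second,) sorts before every (second, x); (second + 1,) sorts after.
        return self.get_slice(row_key, (second,), (second + 1,))


store = WideRowStore()
store.put(0, (42, "name"), "bob")
store.put(0, (7, "name"), "alice")
store.put(0, (7, "score"), 10)
store.put(1, (3, "name"), "carol")

whole_row = store.get_row(0)
one_item = store.get_by_leading_component(0, 7)
print(whole_row)
print(one_item)
```

Because the row key is looked up directly rather than scanned by range, nothing here depends on the order of row keys, which is what makes RandomPartitioner usable with this model.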

OTHER TIPS

Your best option is to use CQL 3. This will let you use composites underneath to optimize your lookups while still allowing you to use the parts of the composite values as though they were separate columns. You're currently using composites in your row keys, and CQL 3 only supports composites in column names (so far), but that's probably ok. In many cases like this, shifting the compositing from the row key to the column name won't have an adverse effect on your performance or data distribution, but if your row keys aren't sufficiently selective, then it might.
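To illustrate the caveat about selectivity, here is a rough Python sketch of how rows spread across nodes when placement is driven by a hash of the row key. RandomPartitioner derives tokens from an MD5 hash of the key; the modulo placement, the four-node cluster, and the sample data below are simplifying assumptions for illustration only, and `node_for` and `distribution` are hypothetical helpers.

```python
import hashlib
from collections import Counter


def node_for(row_key, node_count):
    """Pick a node from the MD5 hash of the row key (simplified placement)."""
    digest = hashlib.md5(str(row_key).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count


def distribution(records, node_count, row_key_of):
    """Count how many records land on each node for a given row-key choice."""
    return Counter(node_for(row_key_of(r), node_count) for r in records)


# Hypothetical data set: only 2 distinct first components, 500 seconds each.
records = [(first, second) for first in range(2) for second in range(500)]

# Row key = full composite: 1000 distinct keys spread over the nodes.
composite_key = distribution(records, 4, row_key_of=lambda r: r)

# Row key = first component only: at most 2 distinct keys, so at most 2 nodes.
first_only = distribution(records, 4, row_key_of=lambda r: r[0])

print(sorted(composite_key.items()))
print(sorted(first_only.items()))
```

With only a handful of distinct first components, the remodeled rows become very wide and concentrate on a few nodes; with many distinct values, the shift to column-name composites costs little in distribution.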

Either way, though, you should be looking at CQL 3. CQL 2 is deprecated. I could tell you more about how to adapt your model for CQL 3 if I knew more about your situation.

Licensed under: CC-BY-SA with attribution