Question

I am relatively new to columnar databases, so please forgive my ignorance. Let's say I have 1,000,000 columns. I would like to return a random sample of 10% of those columns (e.g. c0, c10, c20, ..., c999,980, c999,990).

HBase has column filters, so there I could write a column filter that returns every tenth result. Can I do this in Pycassa/Cassandra?

Thank you


Solution

The only thing you can do server-side is slices. You can read starting at column C10 with a limit of 10 to get columns 10-19. Alternatively, you can ask for specific columns by name, so you could request every tenth column manually if you knew how many columns there were.
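The "ask for specific columns" approach could be sketched roughly as follows. This is only an illustration: it assumes the columns really are named c0, c1, c2, ... (without the thousands separators), that `cf` is a `pycassa.ColumnFamily` for the table, and that the total column count is known. The helper names and the batch size are made up for the example. The only Pycassa call used is `cf.get(key, columns=[...])`; note that Pycassa raises `NotFoundException` when the row or all of the requested columns are missing.

```python
# cf is assumed to be created elsewhere, e.g.:
#   pool = pycassa.ConnectionPool('MyKeyspace')    # keyspace name is hypothetical
#   cf = pycassa.ColumnFamily(pool, 'MyColumns')   # column family name is hypothetical


def every_nth_column_names(total, step=10, prefix="c"):
    """Yield c0, c10, c20, ... below `total` (the naming scheme is an assumption)."""
    for i in range(0, total, step):
        yield "%s%d" % (prefix, i)


def fetch_named_columns(cf, key, names, batch_size=1000):
    """Fetch the named columns of one row, a batch at a time, via cf.get(columns=...)."""
    result = {}
    batch = []
    for name in names:
        batch.append(name)
        if len(batch) == batch_size:
            # One request per batch keeps each call to a manageable size.
            result.update(cf.get(key, columns=batch))
            batch = []
    if batch:
        # Fetch whatever is left over after the last full batch.
        result.update(cf.get(key, columns=batch))
    return result
```

Usage would be something like `fetch_named_columns(cf, 'row_key', every_nth_column_names(1000000))`, which asks for 100,000 columns in 100 requests.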

OTHER TIPS

You could do this easily client-side with Pycassa, but Cassandra does not support server-side filtering.
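A minimal sketch of the client-side approach, assuming `cf` is a `pycassa.ColumnFamily` and using its `xget()` generator, which yields `(name, value)` pairs for a row in comparator order. The function names are invented for this example. Every column still travels over the network; only the selection happens on the client.

```python
import random
from itertools import islice


def every_nth_column(cf, key, step=10):
    """Keep every `step`-th column of the row, in the order xget() returns them."""
    return list(islice(cf.xget(key), 0, None, step))


def random_column_sample(cf, key, fraction=0.1, seed=None):
    """Keep each column independently with probability `fraction`."""
    rng = random.Random(seed)
    return [(name, value)
            for name, value in cf.xget(key)
            if rng.random() < fraction]
```

The first function gives the regular "every tenth" pattern from the question, but "tenth" here means position in the stored sort order, which matches c0, c10, c20, ... only if the comparator sorts the names numerically. The second gives a truly random sample of about 10%, so the exact number of columns returned varies from run to run.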

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow