Question

I read in an article called "Hands-on Cassandra" that Tokyo Cabinet is not good for big data. Why? How many bytes does TC need to store before it starts to perform badly? Is it possible to determine an approximate value?

Solution

According to this article, there is a confirmed performance degradation once the database grows past 500 GB.

According to this broad comparison of NoSQL databases, the problems in TC start at more than 20 million rows.

One possible cause of the size dependency is that TC appears to be implemented as a hash table. Once the number of records far exceeds the number of buckets, you run into frequent hash collisions, which of course ruins performance. By default, the key space is not as large as it can be, so you need to tune the "bnum" parameter (the number of elements of the bucket array) to improve performance.
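To make the collision argument concrete, here is a rough Python sketch of how a fixed-size bucket array fills up. It does not use Tokyo Cabinet itself. The default bucket count of 131071 and the rule of thumb of sizing "bnum" at roughly 0.5 to 4 times the record count are what I recall from the TC documentation, so treat both as assumptions and check them against your version. The function names are made up for illustration.

```python
import random
from collections import Counter

# Assumption: default bucket count of the TC hash database, as I recall it.
DEFAULT_BNUM = 131071


def load_factor(num_records, bnum):
    """Average number of records that land in each bucket."""
    if bnum <= 0:
        raise ValueError("bnum must be positive")
    return num_records / bnum


def suggested_bnum_range(num_records):
    """Assumed rule of thumb: bnum between 0.5x and 4x the record count."""
    return num_records // 2, num_records * 4


def mean_records_in_hit_bucket(num_records, bnum, seed=0):
    """Simulate uniform hashing into bnum buckets.

    Returns how many records, on average, share the bucket of a randomly
    chosen stored record (1.0 would mean no collisions at all).
    """
    rng = random.Random(seed)
    buckets = Counter(rng.randrange(bnum) for _ in range(num_records))
    return sum(count * count for count in buckets.values()) / num_records


if __name__ == "__main__":
    rows = 20_000_000
    print("load factor with default bnum:", load_factor(rows, DEFAULT_BNUM))
    print("suggested bnum range:", suggested_bnum_range(rows))
    # Small simulation: same bucket count, growing number of records.
    for n in (500, 5_000, 20_000):
        print(n, "records in 1000 buckets:", mean_records_in_hit_bucket(n, 1_000))
```

With 20 million rows and an untuned bucket array, each bucket would hold well over a hundred records on average, so every lookup has to walk through many colliding entries. That is consistent with the slowdown reported above, though it is only one plausible contributor.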

Based on various comparisons, MongoDB seems to be the recommended approach for large datasets.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow