Question

I want to know whether BigTable has an upper limit on the size of a cell's contents. By a BigTable cell I mean the value stored at a specific row, specific column family, and specific timestamp. If there is a limit, what is it? If not, how large can a cell grow before performance is adversely affected?

The BigTable paper says that each SSTable file internally consists of 64 KB blocks plus an index. Does this mean that the index key is row+column+timestamp (where + represents concatenation) and that the value mapped by a given key is a corresponding 64 KB cell? In other words, does this mean a BigTable cell cannot exceed 64 KB?

Thanks

Solution

Are you referring to Google's specific implementation of Bigtable? I imagine only someone at Google can answer that question definitively.

The paper itself doesn't limit cells to 64 KB. Although it doesn't say so explicitly, I imagine cells can span multiple SSTable blocks.
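
To make that concrete, here is a toy sketch (my own illustration, not Google's code) of the block-index scheme the paper describes: entries accumulate into a block until it reaches the 64 KB target, and the index maps the first key of each block to that block's byte offset. The index locates blocks, not individual cells, so nothing in this structure caps a single value at 64 KB; an oversized value simply produces an oversized block.

```python
# Toy SSTable-style writer: entries are appended to the current block
# until it reaches the 64 KB target, then a new block starts. The index
# records (first key in block, byte offset), so it locates *blocks*,
# not cells -- an oversized value just yields an oversized block.
import bisect
from typing import List, Tuple

BLOCK_TARGET = 64 * 1024  # the 64 KB block size from the BigTable paper

class ToySSTable:
    def __init__(self) -> None:
        self.data = bytearray()                   # concatenated blocks
        self.index: List[Tuple[bytes, int]] = []  # (first key, block offset)
        self._block_start = 0

    def add(self, key: bytes, value: bytes) -> None:
        """Append an entry; keys must arrive in sorted order."""
        if not self.index or len(self.data) - self._block_start >= BLOCK_TARGET:
            self._block_start = len(self.data)
            self.index.append((key, self._block_start))
        # length-prefixed record: key_len, key, value_len, value
        self.data += len(key).to_bytes(4, "big") + key
        self.data += len(value).to_bytes(4, "big") + value

    def block_for(self, key: bytes) -> int:
        """Binary-search the index for the block that may hold `key`."""
        keys = [k for k, _ in self.index]
        i = bisect.bisect_right(keys, key) - 1
        return self.index[max(i, 0)][1]

t = ToySSTable()
t.add(b"row1/cf:col/ts1", b"x" * 10)
t.add(b"row2/cf:col/ts1", b"y" * (256 * 1024))  # one 256 KB value: fine
t.add(b"row3/cf:col/ts1", b"z" * 10)
print(len(t.index), "index entries for", len(t.data), "bytes of data")
```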

Two of the major open-source implementations of the Bigtable data model both allow cells larger than 64 KB. Apache Cassandra has a technical limit of 2 GB per cell, although the practical limit is much smaller. The Apache HBase FAQ recommends keeping cells under 10 MB, but I'm unsure of the actual technical limit.
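
For illustration, here is a minimal sketch of guarding a write against that ~10 MB HBase guidance, using the third-party happybase client. It assumes `pip install happybase`, a reachable HBase Thrift server on localhost, and an existing table; the table and column names are made up.

```python
# Sketch: refuse to send a cell larger than the ~10 MB size the HBase
# FAQ recommends. Assumes happybase is installed and an HBase Thrift
# server is running on localhost; table/column names are hypothetical.
import happybase

RECOMMENDED_MAX = 10 * 1024 * 1024  # ~10 MB, per the HBase FAQ

def put_cell(table, row_key: bytes, column: bytes, value: bytes) -> None:
    if len(value) > RECOMMENDED_MAX:
        raise ValueError(
            f"value is {len(value)} bytes; cells over ~10 MB are discouraged"
        )
    table.put(row_key, {column: value})

connection = happybase.Connection("localhost")  # Thrift server assumed
table = connection.table("my_table")            # hypothetical table
put_cell(table, b"row-1", b"cf:blob", b"\x00" * 1024)
```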

OTHER TIPS

According to "Designing Your Schema" in the Cloud Bigtable documentation, there are recommended limits and hard limits for individual values.

Recommended Limit: ~10 Mebibytes (10.4858 Megabytes)

Hard Limit: 100 Mebibytes

However, it is worth noting that there are also recommended and hard limits on total row size, and a row holding even a modest number of cells that individually stay within the per-value limits can still exceed the row limits.
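
As a concrete illustration of that interaction, here is a small self-contained sketch. The per-value constants come from the documentation cited above; the per-row figures (roughly 100 MB recommended, 256 MB hard) are from the Cloud Bigtable quotas documentation as I recall it, so verify them before relying on this; the helper name is my own.

```python
# Sketch: check planned cell values against Cloud Bigtable's documented
# limits. Per-value: ~10 MiB recommended, 100 MiB hard (from the docs
# above). Per-row figures below are assumptions taken from the Cloud
# Bigtable quotas page and should be double-checked.
MIB = 1024 * 1024

VALUE_RECOMMENDED = 10 * MIB     # ~10 MiB (10,485,760 bytes)
VALUE_HARD        = 100 * MIB    # 100 MiB hard limit
ROW_RECOMMENDED   = 100_000_000  # ~100 MB recommended per row (assumed)
ROW_HARD          = 256_000_000  # 256 MB hard limit per row (assumed)

def check_row(values: list) -> None:
    row_total = 0
    for v in values:
        if len(v) > VALUE_HARD:
            raise ValueError(f"{len(v)}-byte value exceeds the 100 MiB hard limit")
        if len(v) > VALUE_RECOMMENDED:
            print(f"warning: {len(v)}-byte value exceeds the ~10 MiB recommendation")
        row_total += len(v)
    if row_total > ROW_HARD:
        raise ValueError(f"row totals {row_total} bytes, over the hard row limit")
    if row_total > ROW_RECOMMENDED:
        print(f"warning: row totals {row_total} bytes, over the recommended row size")

# Eleven 10 MiB cells each pass the per-value recommendation, yet
# together they already exceed the ~100 MB recommended row size:
check_row([b"\x00" * (10 * MIB)] * 11)
```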
