What is the purpose of dividing rows into columnfamilies if they can have different number/types of columns anyway?

StackOverflow https://stackoverflow.com/questions/14396001

  •  16-01-2022

Question

Given that a column family can have rows with arbitrary structure, we could store all rows in a single "store" (avoiding the name 'columnfamily/table' on purpose). What, then, is the purpose of column families?

Solution 3

Reasons:

  • To have a different sort order for the columns within a row. The comparator is specified at column family creation time and can't be changed afterwards. So if some rows need their columns sorted alphabetically and others need them sorted numerically, you have to create different column families.
  • To customize the storage options that can be set on a per-column-family basis, e.g. caching of rows, compaction, deletion of expired columns, etc. The per-column-family storage options are listed in the Cassandra documentation.
  • You can't mix counter and non-counter columns in the same column family.
  • As mentioned in other answers, due to logical cohesion - columns represent attributes of some entity identified by the row id.
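
To make the first reason concrete, here is a minimal sketch of how a comparator changes the order of the same column names. It is a toy model in plain Python, not Cassandra's API; the column names are made up, and the comparison to UTF8Type and LongType is only an analogy.

```python
# Toy model: the column names of one row, ordered by the column family's comparator.
# Plain Python for illustration only; this is not how Cassandra is called.

column_names = ["10", "9", "100", "2"]  # hypothetical column names

# A text-style comparator orders names lexicographically
# (similar in spirit to a UTF8Type comparator).
text_order = sorted(column_names)

# A numeric-style comparator orders the same names by integer value
# (similar in spirit to a LongType comparator).
numeric_order = sorted(column_names, key=int)

print(text_order)     # ['10', '100', '2', '9']
print(numeric_order)  # ['2', '9', '10', '100']
```

Since one column family has exactly one comparator, rows that need the first ordering and rows that need the second cannot share a column family.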

OTHER TIPS

The simplest of all reasons is evident in the name itself: "Column Family". A Column Family groups a bunch of related columns together. You could consider it a namespace containing related columns.

For example, the Column "Name" by itself lacks context, which can be provided by ColumnFamilies like "Employees" or "Cities". Otherwise each Column would need to carry all of its context by itself, with no concept of related Columns.
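
The namespace idea can be sketched with nested dictionaries. This is only an illustrative model in plain Python; the row keys, values, and the `get_column` helper are hypothetical and not part of any Cassandra client.

```python
# Toy model: each column family is a namespace mapping row keys to columns.
# All keys and values below are invented sample data.
store = {
    "Employees": {"e42": {"Name": "Alice", "Department": "Billing"}},
    "Cities": {"c7": {"Name": "Lisbon", "Country": "Portugal"}},
}


def get_column(column_family, row_key, column):
    """Look up a column value; the column family supplies the context."""
    return store[column_family][row_key][column]


# The same column name resolves to different things in different families.
employee_name = get_column("Employees", "e42", "Name")
city_name = get_column("Cities", "c7", "Name")
print(employee_name, city_name)
```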

Atomicity

In Cassandra 1.1 and below, the only atomic guarantee you have is that writes to the same row (i.e. with the same key) will be atomic.

Thus, you have to think very carefully about what you want in your columns, and which row those columns should be in, so that your application will behave appropriately if a write fails.
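
A rough sketch of why row placement matters under row-level atomicity follows. It is a plain Python simulation under the assumption that a single-row write either applies completely or not at all; `write_row`, `WriteFailed`, and the account data are hypothetical names for illustration, not Cassandra's API.

```python
# Toy simulation of row-level atomicity; not Cassandra's API.

class WriteFailed(Exception):
    pass


def write_row(column_family, row_key, columns, fail=False):
    """Apply all columns to one row, or none of them if the write fails."""
    if fail:
        raise WriteFailed(row_key)
    column_family.setdefault(row_key, {}).update(columns)


# Design A: both values live in the same row, so a single write covers both.
accounts_a = {}
write_row(accounts_a, "acct1", {"balance": 90, "last_txn": "t17"})

# Design B: the same two values are split across two rows, needing two writes.
accounts_b = {}
write_row(accounts_b, "acct1:balance", {"value": 90})
try:
    write_row(accounts_b, "acct1:last_txn", {"value": "t17"}, fail=True)
except WriteFailed:
    pass  # the first write stays applied; the second never happened

# Design B is now half-updated, which Design A cannot be in this model.
half_updated = "acct1:balance" in accounts_b and "acct1:last_txn" not in accounts_b
print(half_updated)
```

In this model, columns that must change together are safest in the same row, which is the design decision the answer is pointing at.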

Licensed under: CC-BY-SA with attribution