Question

I'm trying to understand how Cassandra's storage engine handles composite columns. Unfortunately, the documentation I've read so far contains errors and has left me confused.

First, terminology. The documentation says:

"Composite columns comprise fully denormalized wide rows by using composite primary keys."

This seems misleading because, AFAIK, composite columns can be used for composite keys, and also simply as composite columns apart from keys.

1: How are composite keys and column names implemented? Every CQL example I can find only shows composite keys as columns, not plain composite columns.

Let's say we have columns 'a', 'b', 'c', 'd' as the composite primary key, plus regular columns 'e' and 'f'. I know 'a' will be the row key (the partition key).

Let's suppose the following data:

a    b    c    d    e    f
1a   1b   1c   1d   e1   f1
1a   1b   1c   2d   e1   f2
1a   1b   1c   2d   e2   f3
2a   2b   2c   2d   e2   f4

2: How is this stored under the hood? I suppose the real question here is how 'b', 'c', 'd' are mapped out, since columns are not hierarchical by definition.

3: The documentation I read says compact storage should no longer be used. But what if non-primary key columns don't need to be added... what's the reason not to use it then?


Solution

1: How are composite keys and column names implemented?

This is mostly answered under question 2. As an aside, in Cassandra 1.2, non-composite keys will also be implemented as composite keys under the hood. Also, the composite column names themselves are not repeated in storage. The in-memory representation interns the names up to a threshold for memory efficiency.

2: How is this stored under the hood?

The first key component (a in your example) becomes the physical row key. The remaining key components (b, c, d) form a prefix for the names of the non-key columns (e, f), and the resulting composite columns are stored presorted (clustered) within the row. So the physical representation of your example looks like this:

    1b.1c.1d, e   1b.1c.1d, f
1a      e1            f1
------------------------------
    2b.2c.2d, e   2b.2c.2d, f
2a      e2            f4

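Note that the second and third rows in your example are not valid together, which is why the diagram leaves them out: they share the same primary key (1a, 1b, 1c, 2d), so they would map to the same column names. Column names must be unique within a physical row, so the two rows cannot coexist.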
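To make the mapping concrete, here is a minimal sketch in Python that models a physical row as a dictionary keyed by composite column name. It is only an illustration, not Cassandra's implementation: the names ROWS and to_physical are made up, tuples stand in for composite names, and it assumes that a later write to an existing column name simply replaces the earlier value.

```python
from collections import defaultdict

# Hypothetical model of PRIMARY KEY (a, b, c, d) with regular columns e and f.
ROWS = [
    ("1a", "1b", "1c", "1d", "e1", "f1"),
    ("1a", "1b", "1c", "2d", "e1", "f2"),
    ("1a", "1b", "1c", "2d", "e2", "f3"),  # same primary key as the row above
    ("2a", "2b", "2c", "2d", "e2", "f4"),
]


def to_physical(rows):
    """Map CQL-style rows to {row_key: {composite_column_name: value}}."""
    storage = defaultdict(dict)
    for a, b, c, d, e, f in rows:
        # The key components (b, c, d) prefix each regular column name.
        storage[a][(b, c, d, "e")] = e
        storage[a][(b, c, d, "f")] = f
    # Columns inside a physical row are kept sorted by composite name.
    return {key: dict(sorted(cols.items())) for key, cols in storage.items()}


physical = to_physical(ROWS)
for row_key, columns in physical.items():
    print(row_key, columns)
```

In this model the two rows that share the key (1a, 1b, 1c, 2d) collapse into a single pair of columns, which is the uniqueness constraint described above.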

The dot notation I used (1b.1c.1d) is figurative. Actual storage uses prefix bytes for metadata followed by data.
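As a rough illustration of what "prefix bytes for metadata followed by data" can look like, the sketch below encodes each component as a 2-byte big-endian length, the component bytes, and a 1-byte end-of-component marker. That layout is my understanding of the legacy CompositeType format, so treat it as an assumption rather than a specification; encode_composite and decode_composite are hypothetical helper names, and all components are treated as UTF-8 text for simplicity.

```python
import struct


def encode_composite(components):
    """Encode components as <2-byte length><bytes><1-byte end-of-component>."""
    out = bytearray()
    for text in components:
        raw = text.encode("utf-8")
        out += struct.pack(">H", len(raw))  # big-endian unsigned 16-bit length
        out += raw
        out += b"\x00"  # end-of-component marker, assumed 0 for a plain component
    return bytes(out)


def decode_composite(data):
    """Inverse of encode_composite: recover the list of components."""
    components, pos = [], 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        components.append(data[pos:pos + length].decode("utf-8"))
        pos += length + 1  # skip the value and its end-of-component byte
    return components


name = encode_composite(["1b", "1c", "1d", "e"])
print(name.hex())
print(decode_composite(name))
```

The point is only that the name is a flat byte string with length prefixes, not a nested structure, which is how a non-hierarchical column model can still carry several key components.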

3: The documentation I read says compact storage should no longer be used. But what if non-primary key columns don't need to be added... what's the reason not to use it then?

The very small gain in storage efficiency is not worth the downside of losing evolvability in your schema.
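One way to picture the trade-off, as a sketch under stated assumptions: I am assuming that the default layout appends the regular column's name as the last component of the cell name, while compact storage drops that component and stores a single value under the key prefix alone. The function names sparse_cells and compact_cells and the extra column 'g' are hypothetical.

```python
def sparse_cells(b, c, d, values):
    """Default layout: one cell per regular column, named (b, c, d, column)."""
    return {(b, c, d, column): value for column, value in values.items()}


def compact_cells(b, c, d, value):
    """Assumed compact layout: a single cell whose name is just (b, c, d)."""
    return {(b, c, d): value}


# Default layout: adding a new column 'g' later is just one more cell name.
sparse = sparse_cells("1b", "1c", "1d", {"e": "e1", "f": "f1"})
sparse.update(sparse_cells("1b", "1c", "1d", {"g": "g1"}))

# Compact layout: the cell name has no column component to tell 'e' from 'g'.
compact = compact_cells("1b", "1c", "1d", "e1")

print(sorted(sparse))
print(sorted(compact))
```

Under that assumption the saving is one short name component per cell, and the cost is that the cell name leaves no room to distinguish a column added later.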

Licensed under: CC-BY-SA with attribution