Question

I found the following in this post:

create table posts(username varchar, time timeuuid, post_text varchar, primary key(username, time))

There will only be as many CF rows as there are variations of the first element in your primary key. This can be a problem if this element has a very low cardinality as you can end up with very wide CF rows.

My question is:

In the statement quoted above, shouldn't "first element" be "second element" of the primary key? That is, isn't it the second element, the clustering column, that causes wide rows?


Solution

This is a matter of terminology. A wide row and a row are not the same thing. In a table with PRIMARY KEY (partition, clustering), there are as many wide rows as there are distinct partition key values. The number of rows, on the other hand, is the number of clustering key values summed over all partitions.

So in the sentence you quoted, the author wrote "rows" but meant "wide-rows". You are right that the clustering column is what makes each wide row grow, but it is the partition key that determines how many wide rows exist.

There will only be as many CF wide-rows as there are variations of the first element in your primary key. This can be a problem if this element has a very low cardinality as you can end up with very wide CF rows.

The term "wide-row" was probably not in common use when that post was written. So, given a table such as

CREATE TABLE wide_rows (
  partitionkey text,
  clusteringkey text,
  data text,
  PRIMARY KEY ((partitionkey), clusteringkey)
);

there will be only as many wide-rows as there are distinct partitionkey values, while the number of rows depends on both partitionkey and clusteringkey:

insert into wide_rows(partitionkey, clusteringkey, data) VALUES ( 'eagertoLearn', 'stackoverflow', 'cassandra question');
insert into wide_rows(partitionkey, clusteringkey, data) VALUES ( 'eagertoLearn', 'google groups', 'cql question');
insert into wide_rows(partitionkey, clusteringkey, data) VALUES ( 'eagertoLearn', 'askubuntu', 'linux shell question');
select * from wide_rows where partitionkey = 'eagertoLearn';

 partitionkey | clusteringkey | data
--------------+---------------+----------------------
 eagertoLearn |     askubuntu | linux shell question
 eagertoLearn | google groups |         cql question
 eagertoLearn | stackoverflow |   cassandra question

(3 rows)

CQL reports that 3 rows were returned, but all 3 rows belong to the same partition key, so together they form 1 wide row.
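As a rough way to see both counts side by side, the sketch below adds one row under a second partition key and then counts partitions and rows separately. The value 'anotherUser' and its data are made up for illustration, and the sketch assumes the table contains only the three rows inserted above.

```cql
-- Hypothetical extra row under a second partition key (illustrative values).
INSERT INTO wide_rows (partitionkey, clusteringkey, data)
VALUES ('anotherUser', 'stackoverflow', 'java question');

-- One result per partition, i.e. one per wide-row: 2 results here.
SELECT DISTINCT partitionkey FROM wide_rows;

-- Counts CQL rows across all partitions: 4 here (3 + 1).
SELECT COUNT(*) FROM wide_rows;
```

Both queries scan every partition, so they are suited to inspecting a small test table like this one rather than a large production table.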

HTH, Carlo

Licensed under: CC-BY-SA with attribution