Will composite partition & compound key affect performance in Cassandra?

https://stackoverflow.com/questions/23274985

09-07-2023

Question

Below is the CQL for 3 tables. All three have the same columns (apart from the types of key4 and data3 in tab1) but differ in how the PRIMARY KEY is defined.

tab1: key1 alone is the partition key (no composite partition key)

CREATE TABLE tab1
(
key1 text,
key2 text,
key3 text,
key4 text,
data1 text,
data2 text,
data3 int,
PRIMARY KEY(key1,key2,key3,key4));

tab2: (key1,key2) forms a composite partition key

CREATE TABLE tab2
(
key1 text,
key2 text,
key3 text,
key4 int,
data1 text,
data2 text,
data3 text,
PRIMARY KEY((key1,key2),key3,key4));

tab3: (key1,key2,key3) forms a composite partition key

CREATE TABLE tab3
(
key1 text,
key2 text,
key3 text,
key4 int,
data1 text,
data2 text,
data3 text,
PRIMARY KEY((key1,key2,key3),key4));

When querying, the values of key1, key2 and key3 are known and key4 is specified as a range. Sample CQL query:

select data1,data2,data3 from tab3 where key1='value1' and key2='value2' and key3='value3' and key4 > 1000 and key4 < 1000000 ;

The key4 range may match some 50,000 records.

Which table design is better?

Which design has better read/write performance?

Solution

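If you need to support range queries over key4, then it needs to be a clustering column with a type that orders the way your range expects. In tab1, key4 is declared as text, so its values would sort as strings rather than numbers, and key1 alone is the partition key, so every row sharing a key1 value lands in the same partition; that rules out tab1. Since you're always specifying an exact value for key3, there's no need to make it a clustering column, so tab3 is a better choice than tab2. Leaving key3 in the partition key will partition your data more evenly around the cluster.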
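
As a rough illustration of the point above, the same lookup can be written against tab2 and tab3, using the placeholder values from the question. Both statements are valid because key4 is the last clustering column in each table; what differs is which columns select the partition, as noted in the comments.

```cql
-- tab2: partition key is (key1, key2); key3 and key4 are clustering columns.
-- All key3 values for a given (key1, key2) pair share one partition.
SELECT data1, data2, data3
FROM tab2
WHERE key1 = 'value1' AND key2 = 'value2'   -- selects the partition
  AND key3 = 'value3'                        -- equality on the first clustering column
  AND key4 > 1000 AND key4 < 1000000;        -- range on the last clustering column

-- tab3: partition key is (key1, key2, key3); key4 is the only clustering column.
-- Each distinct key3 value gets its own partition.
SELECT data1, data2, data3
FROM tab3
WHERE key1 = 'value1' AND key2 = 'value2' AND key3 = 'value3'   -- selects the partition
  AND key4 > 1000 AND key4 < 1000000;                            -- range on the clustering column
```

The trade-off is that tab3 expects the full partition key on every read, so a query that knows only key1 and key2 would not fit that design as written.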

Licensed under: CC-BY-SA with attribution
