Choosing the right schema for Cassandra "table" in CQL3

Question

The answer would be to use a clustering column. A clustering column lets you create dynamic columns: each attribute becomes its own row within the profile's partition, with one column holding the attribute name (col name) and another holding its value (col value).

The table would be

create table mytable ( 
    profile_id text,
    attr_name text,
    attr_value int,
    PRIMARY KEY(profile_id, attr_name)
);

This allows you to insert rows like

insert into mytable (profile_id, attr_name, attr_value) values ('131', 'a1', 3);
insert into mytable (profile_id, attr_name, attr_value) values ('131', 'a2', 1031);
.....
insert into mytable (profile_id, attr_name, attr_value) values ('131', 'an', 2);
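
To read the data back under this layout, a minimal sketch (reusing the table and sample values above) might look like the following. The first query returns every attribute of one profile, ordered by attr_name. The second narrows the read to a single attribute via the clustering column.

```cql
-- all attributes of one profile (a single-partition read)
select attr_name, attr_value from mytable where profile_id = '131';

-- one attribute of one profile, addressed by partition key + clustering column
select attr_value from mytable where profile_id = '131' and attr_name = 'a2';
```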

This would be the optimal solution.

You also mentioned the type of query you would be running on this table: 'select * from mytable where profile_id in (1,2,3,4,5423,44)'. Note that with profile_id declared as text, as in the table above, the ids would have to be quoted ('1', '2', ...).

This would require 6 queries under the hood (one per profile_id, since each id is its own partition), but Cassandra should handle this quickly, especially if you have a multi-node cluster.
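
As a rough sketch of what that means, and assuming the ids are stored as text as in the table definition above, the IN query reads one partition per id. It is therefore approximately equivalent to issuing one single-partition query per profile:

```cql
-- one statement, six partitions fetched on your behalf
select * from mytable where profile_id in ('1', '2', '3', '4', '5423', '44');

-- roughly equivalent single-partition queries, which a client could issue in parallel
select * from mytable where profile_id = '1';
select * from mytable where profile_id = '2';
-- ... and likewise for '3', '4', '5423' and '44'
```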

Also, if you use the DataStax Java Driver, you can run these requests asynchronously and concurrently on your cluster.

For more on data modelling and the DataStax Java Driver, check out DataStax's free online training. It's worth a look: http://www.datastax.com/what-we-offer/products-services/training/virtual-training

Hope it helps.