Question

I'm still trying to wrap my head around the whole concept of composite keys in Cassandra. I picked up this piece of code from https://github.com/thobbs/phpcassa/blob/master/examples/composites.php and I am struggling to understand what it means (please see the questions below and in the comments):

$cf->insert_format = ColumnFamily::ARRAY_FORMAT;
$cf->return_format = ColumnFamily::ARRAY_FORMAT;

$key1 = array("key", 1); // Which one of these is a column name?
$key2 = array("key", 2);

$columns = array(
    array(array(0, "a"), "val0a"), //Which is value, and which is column name?

    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
    array(array(1, "c"), "val1c"),

    array(array(2, "a"), "val2a"),

    array(array(3, "a"), "val3a")
);

/**
 * What type of queries (in CQL if possible) can I achieve with this?
 */

What I would like to understand is:

  • In array("key", 1); are key and 1 the two columns composing this key, or is 1 a value of key?
  • In array(array(0, "a"), "val0a"), which part is the column name and which is the value?
  • In tabular form (or as close as possible), how can I visualize this data as stored in the database? I know it's not stored in table form, but it would help me understand.

I'm new to NoSQL technologies and this is twisting my mind.

Thank you for your assistance :-)

EDIT

Just a few more questions:

  • If you have a row with a composite primary key, does that mean all columns in that row have to be composites?
  • I would like to have a column family with following structure:

    CREATE COLUMN FAMILY users (
        userid int,
        username varchar,
        firstname varchar,
        lastname varchar,
        PRIMARY KEY (userid,username)
    )
    // How can I represent this structure with Phpcassa? 
    // I tried to make every column `array("firstname" => "my name")`, but it didn't work
    
  • Am I allowed to have one of my keys in the composite to be null (in the above example username = null) and maybe add a value later?

Solution

The next two lines of that example probably help:

$cf->insert($key1, $columns);
$cf->insert($key2, $columns);

I'm making slight guesses here since I don't know PHP, but it seems clear from the naming that $cf is the column family, and the two insert() calls add multiple columns to the two rows with keys $key1 and $key2.

The row keys are composite keys, i.e. the first row key is a composite of the string "key" and the number 1. In phpcassa the composite keys are constructed as arrays, I believe.

$key1 = array("key", 1);
$key2 = array("key", 2);

Note that in the example, the row keys and the column keys are composite keys.

That makes $columns an array of columns; each column needs a name (key) and a value...

So for example array(0, "a") is a column name (the column names are also composite keys), and "val0a" is a column value.
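
To make the name/value split concrete, here is a minimal sketch in plain PHP (no phpcassa, no Cassandra connection) that walks over a shortened copy of $columns and separates each entry into its composite name and its value. The variable names $number and $letter are only illustrative labels for the two components of the name.

```php
<?php
// Shortened copy of the example data: each entry is array(column name, column value).
$columns = array(
    array(array(0, "a"), "val0a"),
    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
);

$lines = array();
foreach ($columns as $column) {
    // First element is the (composite) column name, second is the value.
    list($name, $value) = $column;
    // The name itself has two components: a long and an ASCII string.
    list($number, $letter) = $name;
    $lines[] = sprintf('name=(%d, "%s") value="%s"', $number, $letter, $value);
}

echo implode("\n", $lines), "\n";
```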

The data could be visualised as follows: first, the general layout of rows and columns in Cassandra (showing 2 rows each with 3 columns, for example). Note that the columns don't have to follow a tabular structure - we can have name3 in one row and name4 in another, or completely unrelated column names in different rows.

row1 -> name1  name2  name3  ...
        val1   val2   val3   ...

row2 -> name1  name2  name4  ...
        val1   val2   val4   ...

Next, using some of the specific (composite) keys from the example (2 rows of 6 columns). This is how it is actually stored (assuming that this is the correct sort order for these columns, which would depend on the comparator).

("key", 1) ->  (0, "a")    (1, "a")    (1, "b")    (1, "c")    (2, "a")    (3, "a")
               "val0a"     "val1a"     "val1b"     "val1c"     "val2a"     "val3a"

("key", 2) ->  (0, "a")    (1, "a")    (1, "b")    (1, "c")    (2, "a")    (3, "a")
               "val0a"     "val1a"     "val1b"     "val1c"     "val2a"     "val3a"

Because of the composite keys, you could also visualise it with another level of nesting (here, expanding only the column keys). This gives the same kind of structure that Cassandra super columns were sometimes used for:

("key", 1) ->        0                 1                 2                 3
               "a" -> "val0a"    "a" -> "val1a"    "a" -> "val2a"    "a" -> "val3a"
                                 "b" -> "val1b"
                                 "c" -> "val1c"

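The nested view can be reproduced in plain PHP by grouping the flat list of composite columns on the first component of each name. This is only a sketch of the mental model, run against an in-memory copy of $columns; it says nothing about how Cassandra lays the data out on disk.

```php
<?php
// Same column list as in the example: array(composite name, value).
$columns = array(
    array(array(0, "a"), "val0a"),
    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
    array(array(1, "c"), "val1c"),
    array(array(2, "a"), "val2a"),
    array(array(3, "a"), "val3a"),
);

// Group by the first component, then key by the second component.
$nested = array();
foreach ($columns as $column) {
    list($name, $value) = $column;
    list($number, $letter) = $name;
    $nested[$number][$letter] = $value;
}

print_r($nested);
```
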
I suspect it would become clearer if you run the example and can see the outputs!

Update to address the extra questions:

I think you can decide independently whether to use composite row keys and composite column names: see the configuration lines, one for the column names, which are (Long, Ascii), and one for the row keys, which are (Ascii, Long).

"comparator_type" => "CompositeType(LongType, AsciiType)",
"key_validation_class" => "CompositeType(AsciiType, LongType)"

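The comparator is also what determines the order of columns within a row: a composite comparator compares names component by component. As a rough illustration only, the sketch below mimics a (Long, Ascii) ordering in plain PHP. The function compare_composite is a hypothetical helper written for this example; it is not part of phpcassa or Cassandra.

```php
<?php
// Hypothetical helper: compare two (long, ascii) composite names
// component by component, the first component numerically and the
// second as a byte string.
function compare_composite(array $a, array $b)
{
    if ($a[0] !== $b[0]) {
        return ($a[0] < $b[0]) ? -1 : 1;
    }
    return strcmp($a[1], $b[1]);
}

// The column names from the example, deliberately shuffled.
$names = array(
    array(2, "a"), array(1, "c"), array(0, "a"),
    array(1, "a"), array(3, "a"), array(1, "b"),
);

usort($names, 'compare_composite');

foreach ($names as $name) {
    printf("(%d, \"%s\")\n", $name[0], $name[1]);
}
```
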
You can't have a null key. In Cassandra, a value you don't have yet is simply a column you omit (because it isn't really a table), and you can add it later if you want.

Just a brief comment on your column family design (since this answer is getting very long!). I'd consider why you want a composite primary key - surely the userid should be unique anyway?

You can just use a row per user, keyed on userid (or on a composite of userid,username if you really need to), then a column for each of the other fields. Pretty much like a standard relational table. I don't see any need to use composite column names here. Maybe find some simpler phpcassa examples first before trying the composite keys...
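
As a sketch of that row-per-user shape, the snippet below uses a plain PHP array as a stand-in for the column family (row key => column name => value). It only models the data; it does not call phpcassa or connect to Cassandra, and the user id and names are made-up sample values.

```php
<?php
// In-memory stand-in for a "users" column family:
// row key (userid) => array(column name => column value).
$users = array();

// Hypothetical user 42: write the columns known now and omit the rest.
$users[42] = array(
    "firstname" => "Ada",
    "lastname"  => "Lovelace",
);

// Later: add the missing column to the same row.
$users[42]["username"] = "ada";

print_r($users[42]);
```

Nothing is stored for username until it is actually written, which is the "omit it and add it later" behaviour described above.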

Licensed under: CC-BY-SA with attribution