Question

I'm having problems using the time_uuid type as a key in my column family. I want to store my records ordered by when they were inserted, and time_uuid seemed like a good way to do that. This is how I've set up my column family:

sys.create_column_family("keyspace", "records", comparator_type=TIME_UUID_TYPE)

When I try to insert, I do this:

q=pycassa.ColumnFamily(pycassa.connect("keyspace"), "records")
myKey=pycassa.util.convert_time_to_uuid(datetime.datetime.utcnow())
q.insert(myKey,{'somedata':'comevalue'})

However, when I insert data, I always get an error:

Argument for a v1 UUID column name or value was neither a UUID, a datetime, or a number.

If I change the comparator_type to UTF8_TYPE, the insert works, but the items are not returned in the order I expect. What am I doing wrong?

Solution

The comparator for a column family is used for ordering the columns within each row, so it applies to column names, not row keys. You are seeing that error because the column name 'somedata' is valid UTF-8 but not a valid UUID.
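To make the failure concrete, here is a minimal sketch using only the standard library. The helper is_valid_time_uuid_name is hypothetical and is not part of pycassa; it only covers the UUID case, whereas the error message above indicates that pycassa also accepts datetimes and numbers.

```python
import uuid

def is_valid_time_uuid_name(name):
    """Return True if name is (or parses as) a version 1 UUID.

    Hypothetical helper for illustration only; pycassa's own check
    also accepts datetimes and numbers and converts them for you.
    """
    if isinstance(name, uuid.UUID):
        return name.version == 1
    try:
        return uuid.UUID(name).version == 1
    except (ValueError, AttributeError, TypeError):
        # Not a string that can be parsed as a UUID at all.
        return False

print(is_valid_time_uuid_name('somedata'))    # False
print(is_valid_time_uuid_name(uuid.uuid1()))  # True
```

The string 'somedata' fails the check, which is roughly what happens when it is used as a column name under a TIME_UUID_TYPE comparator.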

The ordering of the rows stored in Cassandra is determined by the partitioner. Most likely you are using RandomPartitioner, which distributes load evenly across your cluster but does not allow for meaningful range queries (the rows are returned in the order of the hashes of their keys, which is effectively random).

http://wiki.apache.org/cassandra/FAQ#range_rp
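The effect of hashing on row order can be sketched as follows. This is a simplified illustration, not Cassandra's actual code: it assumes the token is just the MD5 digest of the key read as an integer, and the function name random_partitioner_token and the sample keys are made up for the example.

```python
import hashlib

def random_partitioner_token(row_key):
    """Simplified token: the MD5 digest of the key bytes, read as an integer."""
    return int(hashlib.md5(row_key).hexdigest(), 16)

# Row keys that sort naturally by date.
row_keys = [b"2012-01-01", b"2012-01-02", b"2012-01-03", b"2012-01-04"]

# A range scan walks rows in token order, not in key order.
rows_in_token_order = sorted(row_keys, key=random_partitioner_token)
print(rows_in_token_order)
```

The printed order depends only on the hashes, so it generally has no relation to the chronological order of the keys.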

OTHER TIPS

The problem is that in your data model, you are using the time as a row key. Although this is possible, you won't get a meaningful ordering unless you also use the ByteOrderedPartitioner.

For this reason, most people insert time-ordered data using the time as a column name, not a row key. In this model, your insert statement would look like:

q.insert(someKey, {datetime.datetime.utcnow(): 'somevalue'})

where someKey is a key that identifies the entire time series that you're inserting into (for example, a username). Note that you don't have to convert the time to a UUID; pycassa does it for you. To store something more than a single value, use a supercolumn or a composite key.
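Why time-based column names come back in insertion order can be sketched with the standard library alone. The helper time_uuid_from_datetime below is hypothetical and is not pycassa's implementation; it assumes a naive UTC datetime and fixed clock sequence and node values. The sort key mimics a comparator that orders version 1 UUIDs by their embedded timestamp.

```python
import datetime
import uuid

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

def time_uuid_from_datetime(dt, clock_seq=0, node=0):
    """Build a version 1 UUID whose timestamp encodes dt (naive, UTC)."""
    delta = dt - datetime.datetime(1970, 1, 1)
    micros = (delta.days * 86400 + delta.seconds) * 1000000 + delta.microseconds
    timestamp = micros * 10 + UUID_EPOCH_OFFSET
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# One row of a time series: column name (time UUID) -> value.
start = datetime.datetime(2012, 1, 1)
row = {}
for offset_seconds in (30, 10, 20):  # inserted out of order on purpose
    when = start + datetime.timedelta(seconds=offset_seconds)
    row[time_uuid_from_datetime(when)] = 'event at +%ds' % offset_seconds

# Sorting by the embedded timestamp recovers chronological order.
for name in sorted(row, key=lambda u: u.time):
    print(name, row[name])
```

Because the ordering happens on column names inside a single row, it does not depend on which partitioner the cluster uses.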

If you really want to store the time in your row keys, then you need to specify key_validation_class, not comparator_type. comparator_type sets the type of the column names, while key_validation_class sets the type of the row keys.

sys.create_column_family("keyspace", "records", key_validation_class=TIME_UUID_TYPE)

Remember the rows will not be sorted unless you also use the ByteOrderedPartitioner.

Licensed under: CC-BY-SA with attribution