Question

I need to store binary data in every column of a Cassandra column family. The code below produces the bytes. My row key will be a String, but all of my columns have to store binary blob data.

GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema); 
ByteArrayOutputStream os = new ByteArrayOutputStream(); 
Encoder e = EncoderFactory.get().binaryEncoder(os, null); 
writer.write(record, e); 
e.flush(); 
byte[] byteData = os.toByteArray(); 
os.close();

// write byteData in Cassandra.

I am not sure what the right way is to create the Cassandra column family for this use case. Below is the column family I have created. Is it the right approach?

create column family TESTING
with key_validation_class = 'UTF8Type'
and comparator = 'UTF8Type'
and default_validation_class = 'UTF8Type'
and gc_grace = 86400
and column_metadata = [ {column_name : 'lmd', validation_class : DateType}];

Update:

I am going to use the Astyanax client to retrieve the data from Cassandra. My use case is simple.

All the columns in the column family above will store only binary blob data.

How about this column family? Does it look right?

create column family TESTING
with key_validation_class = 'UTF8Type'
and comparator = 'TimeUUIDType'
and default_validation_class = 'ByteType'
and gc_grace = 86400
and column_metadata = [ {column_name : 'lmd', validation_class : DateType}];

When I tried to create the column family above, I got this exception:

[default@profileks] create column family TESTING
...     with key_validation_class = 'UTF8Type'
...     and comparator = 'TimeUUIDType'
...     and default_validation_class = 'ByteType'
...     and gc_grace = 86400
...     and column_metadata = [ {column_name : 'lmd', validation_class : DateType}];

java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: Unknown timeuuid representation: lmd

I will be storing userId as the row key, then my column name, which will store the binary blob data, and lastly lmd as the DateType column.


Solution

@Trekkie

If you're using a Thrift client:

create column family TESTING
with key_validation_class = 'UTF8Type'
and comparator = 'TimeUUIDType'
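and default_validation_class = 'BytesType'

*default_validation_class* is BytesType, Cassandra's blob type, so that column values are stored as raw bytes.

Since you did not specify how you want to access your data, you can use TimeUUIDType as the comparator to get a natural, time-based ordering of your columns. The column_metadata entry is left out because, with a TimeUUIDType comparator, every column name must be a valid TimeUUID; 'lmd' is not one, which is what the MarshalException in the question reports.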
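To make the "natural ordering" point concrete, here is a minimal sketch in plain Java (no Cassandra client involved). It assumes that TimeUUIDType orders column names primarily by the timestamp embedded in a version 1 UUID. The class name TimeUuidOrdering and the helpers timeUuid and sortChronologically are hypothetical names made up for this illustration; with Astyanax you would normally let the client library generate the TimeUUIDs instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

public class TimeUuidOrdering {
    // Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Hypothetical helper: builds a version 1 UUID for a Unix time in milliseconds.
    // nodeBits is a caller-supplied placeholder, not a real MAC address.
    public static UUID timeUuid(long unixMillis, long nodeBits) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        // Layout of the most significant 64 bits: time_low | time_mid | version | time_hi
        long msb = (timeLow << 32) | (timeMid << 16) | (1L << 12) | timeHi;
        // Variant bits "10" followed by the placeholder clock sequence and node bits.
        long lsb = 0x8000000000000000L | (nodeBits & 0x3FFFFFFFFFFFFFFFL);
        return new UUID(msb, lsb);
    }

    // Sorts column names by their embedded timestamp (assumed to mirror TimeUUIDType).
    public static List<UUID> sortChronologically(List<UUID> columnNames) {
        List<UUID> sorted = new ArrayList<>(columnNames);
        sorted.sort(Comparator.comparingLong(UUID::timestamp));
        return sorted;
    }

    public static void main(String[] args) {
        UUID later = timeUuid(2_000L, 1L);
        UUID earlier = timeUuid(1_000L, 2L);
        List<UUID> sorted = sortChronologically(Arrays.asList(later, earlier));
        System.out.println("first column: " + sorted.get(0));
        System.out.println("is the earlier one: " + sorted.get(0).equals(earlier));
    }
}
```

Note that java.util.UUID's own compareTo does not sort by time, which is why the sketch compares timestamp() explicitly.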

If you're using CQL3:

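CREATE TABLE TESTING (
  partition_key text,    // corresponds to the row key
  column_name timeuuid,  // clustering column, corresponds to the column name
  data blob,             // corresponds to the column value
  PRIMARY KEY (partition_key, column_name));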

Other suggestions

@Trekkie

I now understand your requirement:

  1. row key = text
  2. column name = bytes (the stored data)
  3. value = none

Initially, I assumed that you were storing the binary data in the column value, not in the column name.

If you store data in the column name, be very careful, because a column name cannot hold more than 64K of data. Are you sure your blob will never exceed 64K?

http://wiki.apache.org/cassandra/FAQ#max_key_size
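
If you do go with data in the column name, a guard like the following could run before each write. This is only a sketch: the limit is assumed to be 65535 bytes (the largest length that stays under 64K), and the class name ColumnNameSizeCheck and method fitsInColumnName are hypothetical names for this example, not part of Astyanax or Cassandra.

```java
public class ColumnNameSizeCheck {
    // Assumed limit: column names must stay under 64K, taken here as 65535 bytes.
    public static final int MAX_COLUMN_NAME_BYTES = 0xFFFF;

    // Returns true if the serialized blob is small enough to be used as a column name.
    public static boolean fitsInColumnName(byte[] blob) {
        return blob != null && blob.length <= MAX_COLUMN_NAME_BYTES;
    }

    public static void main(String[] args) {
        byte[] small = new byte[1024];    // stands in for a small Avro record
        byte[] large = new byte[70_000];  // stands in for an oversized record
        System.out.println("small fits: " + fitsInColumnName(small));
        System.out.println("large fits: " + fitsInColumnName(large));
    }
}
```

In the question's code, byteData would be the array to check. A blob that fails the check would have to go into the column value instead.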

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow