Question

I'm developing a PHP platform that will make heavy use of images, documents, and any other file format I can think of, so I was wondering whether Cassandra is a good choice for my needs.

If not, can you tell me how I should store files? I'd like to keep using Cassandra because it's fault-tolerant and replicates data automatically across nodes.

Thanks for help.


Solution

From the Cassandra wiki:

Cassandra's public API is based on Thrift, which offers no streaming abilities:
any value written or fetched has to fit in memory. This is inherent to Thrift's
design and is therefore unlikely to change. So adding large object support to
Cassandra would need a special API that manually splits the large objects up
into pieces. A potential approach is described in http://issues.apache.org/jira/browse/CASSANDRA-265.
As a workaround in the meantime, you can manually split files into chunks of whatever
size you are comfortable with -- at least one person is using 64MB -- and make a file correspond
to a row, with the chunks as column values.

So if your files are under 10MB you should be fine. Just make sure to enforce a file size limit, or break larger files up into chunks.
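The chunking workaround from the wiki quote can be sketched roughly as follows. This is only an illustration in Python, not Cassandra client code: a plain dictionary stands in for a column family (row key mapped to columns), and the names `iter_chunks`, `store_file`, `load_file` and the 1 MiB `CHUNK_SIZE` are all assumptions made up for this example. With a real client you would replace the dictionary writes and reads with inserts and slice queries.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple

# Hypothetical chunk size for this sketch; pick whatever fits your memory budget.
CHUNK_SIZE = 1024 * 1024  # 1 MiB

# Stand-in for a Cassandra column family: row key -> {column name -> value}.
# This is NOT a real client API, just an in-memory model of "one row per file".
FakeStore = Dict[str, Dict[int, bytes]]


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs until the stream is exhausted."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, data
        index += 1


def store_file(store: FakeStore, row_key: str, stream: BinaryIO,
               chunk_size: int = CHUNK_SIZE) -> int:
    """Write one file as one row, with each chunk as a column value.

    Returns the number of chunks written.
    """
    row: Dict[int, bytes] = {}
    for index, data in iter_chunks(stream, chunk_size):
        row[index] = data  # column name = chunk index, column value = chunk bytes
    store[row_key] = row   # replace any previous version of the file
    return len(row)


def load_file(store: FakeStore, row_key: str) -> bytes:
    """Reassemble a file by concatenating its chunks in index order."""
    row = store[row_key]
    return b"".join(row[i] for i in sorted(row))


if __name__ == "__main__":
    store: FakeStore = {}
    payload = b"x" * 2_500_000
    chunks = store_file(store, "report.pdf", io.BytesIO(payload))
    assert load_file(store, "report.pdf") == payload
    print(chunks, "chunks stored")
```

Only one chunk is held in memory at a time while writing, which is the point of the workaround: no single value ever has to exceed the chunk size. Note that `load_file` here still joins the whole file in memory; for large files you would stream the chunks out one by one instead.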

OTHER TIPS

You should be OK with files of 10MB. In fact, DataStax Brisk puts a filesystem on top of Cassandra if I'm not mistaken: http://www.datastax.com/products/enterprise.

(I'm not associated with them in any way; this isn't an ad.)

As more recent information: Netflix's Cassandra client, Astyanax, provides utilities for storing files as chunked object stores. A description and examples can be found here. Writing some tests with Astyanax could be a good starting point for evaluating Cassandra as file storage.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow