Question

I've been looking into CouchDB's attachments feature. Basically, CouchDB lets you store binary file data inside database records, similar to MongoDB's GridFS. The project I want to build revolves heavily around file uploads, which I planned to store in CouchDB. This led me to research how CouchDB clusters data so that, as my database grows with file attachments, I can spread it across multiple servers. I was disappointed to find that CouchDB cannot do this out of the box. The CouchDB guide says to use something called couchdb-lounge, but that project has been untouched on GitHub for more than 2 years. I don't think I'd feel comfortable building on that.
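For context, here is a minimal sketch of what storing an attachment looks like over CouchDB's HTTP API, where a file is sent with a PUT to `/{db}/{docid}/{attachment name}`. The database name `uploads`, the document ID, and the helper `build_attachment_request` are made up for illustration, and the request is only built, not sent:

```python
import urllib.request
from urllib.parse import quote


def build_attachment_request(base_url, db, doc_id, filename, data,
                             content_type, rev=None):
    """Build (but do not send) a PUT request that stores `data` as an attachment.

    `rev` is the current document revision; it is only needed when the
    document already exists.
    """
    parts = [quote(p, safe="") for p in (db, doc_id, filename)]
    url = base_url.rstrip("/") + "/" + "/".join(parts)
    if rev is not None:
        url += "?rev=" + quote(rev, safe="")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Hypothetical values; 5984 is CouchDB's default port.
req = build_attachment_request(
    "http://localhost:5984", "uploads", "report-2013",
    "report.pdf", b"%PDF-1.4 example bytes", "application/pdf",
)
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` against a running CouchDB instance.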

I found BigCouch, which appears to be a modified CouchDB that includes exactly the clustering functionality I need, except that it seems to lag behind the current stable CouchDB release. I read in a press release from a year ago that they're working on merging BigCouch into the official CouchDB, but I don't know what the timeline for that looks like.

As a third option, it looks like Couchbase Server 2 is also based on CouchDB but has clustering built in, among other features. I'm debating that as a viable option, too. It doesn't support file attachments, though.

The fact that BigCouch will land in CouchDB, eventually, gives me some reassurance to go ahead and use BigCouch for now.

Should I use BigCouch? Why wouldn't everybody use BigCouch, if it's just CouchDB + clustering? There must be some down-side, right?


Solution

My needs at my job are a bit different from yours, but I've worked with Couchbase, CouchDB, and BigCouch. I found BigCouch very easy to set up in the cloud; it took only one day to get a cluster running. After doing our due diligence, we're investing in BigCouch and committing to it for a major mobile initiative.

Reasons why:

  1. BigCouch is fairly easy to set up in a cloud environment. The documentation is light, but I was able to get a simple cluster up and running quickly. I would recommend keeping an eye on the private hostnames of the machines in a cloud environment. (I can send along my detailed notes for creating machines in the cloud if that helps.)

  2. BigCouch is maintained by Cloudant and of course it's open source, which is nice. The CTO of Cloudant told me they have already merged quite a bit of code into the Apache CouchDB project. Also Cloudant seems pretty stable, so we're counting on them to keep the project up to date. It seems like a good community (unlike something like TouchDB).

  3. From what I can tell BigCouch mostly wraps itself around the core CouchDB code/APIs. This is good because it makes me think they started with CouchDB as the foundation and didn't try to do too much on top of it. For example, CouchDB's replication is already very good and BigCouch hasn't tried to re-invent the wheel. They just added some things that Couch was missing.

  4. One downside to running BigCouch "raw" as opposed to with Cloudant is that Cloudant maintains their own internal fork that has more features. Our evaluation found that those features weren't needed though. They were a bit overkill for us.

  5. Couchbase specifically seems to be a step behind. It took a long time to get to Couchbase 2.0 and I've been disappointed with Couchbase prior to 2.0. I hear 2.0 is great but haven't had a chance to use it yet. I've felt kind of burned with releases prior to 2.0 for various reasons.

Other tips

Not everyone needs clustering. The CouchDB team is intent on merging BigCouch soon after the almost-ready 1.3 release, so starting to look into BigCouch would certainly make sense (and I would personally definitely pick BigCouch over Couchbase or couchdb-lounge -- many of the BigCouch contributors are CouchDB committers, anyway).

The downside of clustering is the extra complexity of it. I would argue that unless you're already an experienced CouchDB user, using BigCouch from day 1 is perhaps a step too far.

As an alternative to learning how to set up and maintain a BigCouch deployment, you could go for an online CouchDB host like Cloudant and let them deal with the complexity of managing a cluster of machines. All you deal with is something which still looks like your local CouchDB instance.

Regarding storing files in CouchDB, why not store them in S3? (A lot cheaper than Cloudant btw)
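A minimal sketch of that idea, assuming the file itself is uploaded to S3 separately with whatever S3 client you prefer, and CouchDB only holds a small JSON document pointing at it. The field names, the bucket `my-uploads`, and the helper `build_file_reference_doc` are all hypothetical:

```python
import hashlib
import json


def build_file_reference_doc(doc_id, bucket, key, data, content_type):
    """Return a CouchDB document that references a file stored in S3.

    Only metadata goes into the database; the bytes live in the bucket.
    """
    return {
        "_id": doc_id,
        "type": "file_reference",
        "storage": {"provider": "s3", "bucket": bucket, "key": key},
        "content_type": content_type,
        "length": len(data),
        # A checksum lets you verify the S3 object against the record later.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


payload = b"hello"
doc = build_file_reference_doc(
    "file-0001", "my-uploads", "users/42/hello.txt", payload, "text/plain",
)
# This JSON body is what you would PUT to /{db}/file-0001 in CouchDB.
print(json.dumps(doc, indent=2))
```

The database then stays small, so the need to cluster CouchDB purely because of attachment size may not arise.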

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow