Question

Say many apps are using the same Couchbase backend, and I want to perform some batch analysis on the data they have generated. If I use the map/reduce functionality in Couchbase, will this cause any issues, considering the database still has to store new data coming from the running apps?

Would it be overkill to run MongoDB in conjunction with Couchbase, where all apps store their data in Couchbase and this data is duplicated to MongoDB? The analysis would then be carried out using MongoDB (and the mongo-hadoop connector).

Solution

OK, you really need to add more detail about the queries you'll need to run and the type and structure of the data you are storing. I'll try to answer each of your questions at a broad level.

Would it be overkill to run mongo in conjunction with couchbase?

Yes, most definitely! This sounds like a bad idea: both fill the same space (document stores), just with different strengths and weaknesses.

Can Couchbase do map reduce and still serve reads and writes at a high level?

Yes, certainly, but views in Couchbase are eventually consistent, whereas key access is always consistent. You can make view queries consistent with the stale=false parameter, but then the index has to be brought up to date before each query returns, so the map/reduce work runs much more often, which WILL affect how fast the data is returned.

Couchbase has Elasticsearch and Hadoop connectors that allow data to be replicated from your cluster(s) to ES or Hadoop automatically. Personally, we use Elasticsearch for more advanced analytics and free-text search without impacting our Couchbase cluster.
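To make the stale trade-off concrete, here is a minimal sketch that builds the URL for a view query against the Couchbase view REST endpoint (port 8092). The host, the bucket name `events`, the design document `analytics`, and the view `by_hour` are all hypothetical; only the `stale` values (`ok`, `update_after`, `false`) come from the view API, and the `update_after` default reflects my understanding of the server's behaviour rather than anything stated above.

```python
from urllib.parse import quote, urlencode

# Values the view API accepts for the stale parameter.
VALID_STALE = ("ok", "update_after", "false")


def view_query_url(host, bucket, design_doc, view, stale="update_after", **params):
    """Build a view query URL; host/bucket/design_doc/view are caller-supplied names."""
    if stale not in VALID_STALE:
        raise ValueError("stale must be one of %r" % (VALID_STALE,))
    query = {"stale": stale}
    query.update(params)  # e.g. group_level=4
    path = "/{}/_design/{}/_view/{}".format(
        quote(bucket, safe=""), quote(design_doc, safe=""), quote(view, safe="")
    )
    return "http://{}:8092{}?{}".format(host, path, urlencode(query))


# Fast, possibly stale results: the index is refreshed after the response is sent.
fast_url = view_query_url("localhost", "events", "analytics", "by_hour")

# Consistent results: the index is updated before the response, so the query is slower.
consistent_url = view_query_url(
    "localhost", "events", "analytics", "by_hour", stale="false", group_level=4
)
```

For batch analysis, the stale-tolerant form is usually the one to reach for, since it keeps index updates off the critical path of each query.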

MongoDB or Couchbase?

Use one but not both. We use Couchbase in production, but MongoDB is also more than capable of filling the same role (with more flexible querying too). MongoDB can also easily integrate with Hadoop/Elasticsearch.

I'd really go back and look at your data, how you'll need to access it, and its volume; most probably Hadoop or ES will be overkill.

If you need to identify spending patterns or group events by day/hour/minute, then either document store will easily handle that.
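As a rough illustration of that kind of grouping, the sketch below simulates in plain Python what a map function emitting a date-array key, combined with a sum reduce and a group level, would produce. Real Couchbase views are written in JavaScript, so this is only a model of the idea, and the `timestamp` and `amount` fields are assumed for the example.

```python
from collections import defaultdict
from datetime import datetime


def map_event(doc):
    """Emit ((year, month, day, hour, minute), amount), like a view map function would."""
    ts = datetime.strptime(doc["timestamp"], "%Y-%m-%dT%H:%M:%S")
    yield (ts.year, ts.month, ts.day, ts.hour, ts.minute), doc["amount"]


def grouped_sum(docs, group_level):
    """Sum emitted values per key prefix, mimicking a sum reduce with a group level."""
    totals = defaultdict(float)
    for doc in docs:
        for key, value in map_event(doc):
            totals[key[:group_level]] += value
    return dict(sorted(totals.items()))


# Hypothetical spending events.
events = [
    {"timestamp": "2014-03-01T09:15:00", "amount": 10.0},
    {"timestamp": "2014-03-01T09:45:00", "amount": 5.0},
    {"timestamp": "2014-03-01T17:05:00", "amount": 20.0},
    {"timestamp": "2014-03-02T08:00:00", "amount": 7.5},
]

per_day = grouped_sum(events, group_level=3)   # keys are (year, month, day)
per_hour = grouped_sum(events, group_level=4)  # keys are (year, month, day, hour)
```

Changing the group level is all it takes to move between per-day, per-hour, and per-minute totals, which is why a single view can serve all three granularities.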

I'm sure someone with production Mongo experience will also chip in!

Other answers

Yes, you can use the map/reduce functionality on a Couchbase cluster. You will need to add more nodes to increase the cluster's throughput (scale horizontally), so that it has enough computing power to serve concurrent requests from the clients and also maintain the map/reduce views.

As for offloading data somewhere else, you can use XDCR (cross data center replication) to keep the data in sync with another Couchbase cluster that is used solely for map/reduce. So you don't need to use MongoDB at all. In fact, Viber has replaced MongoDB with Couchbase.

Licensed under: CC-BY-SA with attribution