Question

Can someone highlight the pros and cons of Hector's batch mutation versus using Hadoop jobs to load data into Cassandra?

I know in Hector you can do something like the following:

mutator.addInsertion(...);
mutator.execute();

And in Hadoop you can use MR jobs to load data into Cassandra.

I'm looking for the reasons to use or not to use each of them. Thanks!


Solution

If the data source is not already in Hadoop (or HBase), I would recommend just a multi-threaded loader using Mutator as above, to keep the number of moving parts down.

This gist is dated, but the approach would be similar: https://gist.github.com/397574
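
As a rough illustration of the multi-threaded approach, here is a minimal sketch. It is not taken from the gist and does not use Hector directly: `BatchWriter` is a hypothetical stand-in whose `add` would wrap `mutator.addInsertion(...)` and whose `flush` would wrap `mutator.execute()`. `ParallelLoader`, the two-element row layout, and the `batchSize` parameter are likewise assumptions made for the example. The sketch also assumes each worker thread should own its own writer rather than share one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Hector Mutator: queue rows, then send them as one batch. */
interface BatchWriter {
    void add(String rowKey, String value); // would wrap mutator.addInsertion(...)
    void flush();                          // would wrap mutator.execute()
}

final class ParallelLoader {

    /** Splits rows into one slice per thread and writes each slice in batches. */
    static int load(List<String[]> rows, int threads, int batchSize,
                    Supplier<BatchWriter> writerFactory)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            int sliceSize = (rows.size() + threads - 1) / threads; // ceiling division
            for (int start = 0; start < rows.size(); start += sliceSize) {
                List<String[]> slice =
                        rows.subList(start, Math.min(start + sliceSize, rows.size()));
                // Assumption: one writer per task, never shared across threads.
                Callable<Integer> task =
                        () -> writeSlice(slice, batchSize, writerFactory.get());
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // rethrows any failure from a worker
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    /** Queues rows on the writer and flushes every batchSize rows, plus once at the end. */
    private static int writeSlice(List<String[]> slice, int batchSize, BatchWriter writer) {
        int pending = 0;
        for (String[] row : slice) {
            writer.add(row[0], row[1]);
            if (++pending == batchSize) {
                writer.flush();
                pending = 0;
            }
        }
        if (pending > 0) {
            writer.flush();
        }
        return slice.size();
    }

    public static void main(String[] args) throws Exception {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(new String[] {"key" + i, "value" + i});
        }
        // Demo writer that only logs; a real one would delegate to a Mutator.
        int written = load(rows, 3, 4, () -> new BatchWriter() {
            public void add(String rowKey, String value) { }
            public void flush() { System.out.println("flushed a batch"); }
        });
        System.out.println("rows written: " + written);
    }
}
```

The thread count and batch size are the two knobs to tune against your cluster; the values in `main` are placeholders, not recommendations.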

Let me know if you want more details.

Licensed under: CC-BY-SA with attribution