Hector's batch Mutation vs. using Hadoop jobs to load data into Cassandra?

https://stackoverflow.com/questions/7079554

hadoop
cassandra
hector

24-12-2020

Question

Can someone highlight the pros and cons of Hector's batch mutation versus Hadoop jobs for loading data into Cassandra?

I know in Hector you can do something like the following:

mutator.addInsertion(...);
mutator.execute();

And in Hadoop you can use MR jobs to load data into Cassandra.

I'm looking for the reasons to use or not to use each of them. Thanks!

Solution

If the data source is not already in Hadoop (or HBase), I would recommend just a multi-threaded loader using Mutator, as above, to keep down the number of moving parts.
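As a rough illustration of what such a loader could look like, here is a minimal sketch that splits the rows into batches and writes each batch on a worker thread. The class name `ParallelLoader`, the `load` method, and the `writeBatch` callback are hypothetical names invented for this example. The callback stands in for the Hector calls from the question (one `mutator.addInsertion(...)` per row, then a single `mutator.execute()`), and using a separate mutator for each batch is an assumption, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper: not part of Hector or Cassandra.
final class ParallelLoader {

    /**
     * Splits rows into batches of at most batchSize and hands each batch to
     * writeBatch on a pool of worker threads. Returns the number of rows handed off.
     */
    static <T> int load(List<T> rows, int batchSize, int threads,
                        Consumer<List<T>> writeBatch) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int start = 0; start < rows.size(); start += batchSize) {
                List<T> batch = rows.subList(start, Math.min(start + batchSize, rows.size()));
                pending.add(pool.submit(() -> {
                    // Assumption: with Hector, writeBatch would create a mutator,
                    // call addInsertion(...) for each row, then execute() once.
                    writeBatch.accept(batch);
                    return batch.size();
                }));
            }
            int written = 0;
            for (Future<Integer> f : pending) {
                // get() blocks until the batch is done; a worker failure
                // surfaces here as an ExecutionException.
                written += f.get();
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `ParallelLoader.load(rows, 500, 8, batch -> writeWithMutator(batch))` would then be the whole loader, where `writeWithMutator` is a placeholder for your own Hector code. Batch size and thread count are tuning knobs to adjust against your cluster, not recommended values.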

This gist is dated, but the approach would be similar: https://gist.github.com/397574

Let me know if you want more details.

Licensed under: CC-BY-SA with attribution
