Question

Is it possible to read data from MongoDB with the MongoDB Hadoop connector, process it with Hadoop MapReduce, and then write the results without the connector, leaving the MapReduce output as is in HDFS?

Solution

I think this previous answer on Stack Overflow answers your question, with a minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS(MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass( SequenceFileOutputFormat.class );

You'll also need to set the HDFS output path where the data should be saved. See the connector's WordCount example for full code, but use the output format above instead of MongoOutputFormat.
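As a rough sketch of how the pieces fit together, a job could look something like the following. It assumes the mongo-hadoop connector and the Hadoop 2+ `mapreduce` API are on the classpath. The MongoDB URI, the `word` field, the HDFS path, and the class names are made up for illustration, so substitute your own.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfs {

    // Emits (value of the "word" field, 1) for each MongoDB document.
    // The field name "word" is an assumption for this example.
    public static class FieldMapper
            extends Mapper<Object, BSONObject, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, BSONObject doc, Context context)
                throws IOException, InterruptedException {
            Object field = doc.get("word");
            if (field != null) {
                word.set(field.toString());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts for each key.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Input still comes from MongoDB through the connector (example URI).
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/test.in");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfs.class);
        job.setMapperClass(FieldMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setInputFormatClass(MongoInputFormat.class);
        // Output goes to HDFS instead of MongoOutputFormat.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        // Example path; the directory must not exist before the job runs.
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-out"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If you would rather have human-readable files in HDFS, TextOutputFormat should work in place of SequenceFileOutputFormat with the same output path setting.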

Licensed under: CC-BY-SA with attribution