Is it possible to read MongoDB data via the mongo-hadoop connector, process it with Hadoop MapReduce, but then not write the results back through the mongo-hadoop connector, and instead leave the MapReduce output as-is on HDFS?


Solution

I think this previous answer on SO answers your question, with a minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS(MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass( SequenceFileOutputFormat.class );

You'll also need to set the HDFS output path where you want the data saved. See the connector's WordCount example for complete code, but use the output format above instead of MongoOutputFormat.
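A minimal job driver might look like the sketch below. It assumes the mongo-hadoop connector is on the classpath; the Mongo URI, the HDFS output path, and the commented-out mapper/reducer classes are placeholders you would replace with your own.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfsJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Input still comes from MongoDB via the connector
        // (the database/collection here are placeholders).
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfsJob.class);

        // Plug in your own mapper/reducer here, e.g. WordCount-style classes:
        // job.setMapperClass(TokenizerMapper.class);
        // job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setInputFormatClass(MongoInputFormat.class);
        // The key change: write results to HDFS instead of back to MongoDB.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If you prefer plain-text output over sequence files, TextOutputFormat works the same way; only the `setOutputFormatClass` call changes.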

Licensed under: CC-BY-SA with attribution