Question

Is it possible to read MongoDB data through the Hadoop MongoDB connector plugin, process it with Hadoop MapReduce, and then, instead of writing the results back through the connector, leave the MapReduce output as is in HDFS?

Solution

I think this previous answer on Stack Overflow covers your question, with one minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS(MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass( SequenceFileOutputFormat.class );

You'll also need to set the HDFS output path where the data should be saved. See the connector's WordCount example for complete code, but use the output format above instead of MongoOutputFormat.
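As a rough illustration of the setup described above, here is a minimal sketch of a job driver that reads from MongoDB and writes a SequenceFile to HDFS. It assumes the mongo-hadoop connector and Hadoop's `org.apache.hadoop.mapreduce` API are on the classpath. The class name `MongoToHdfsJob`, the MongoDB URI, the `category` field, and the output path are all hypothetical placeholders, and the mapper and reducer are only a stand-in for your own processing.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfsJob {

    // Emits (field value, 1) for every MongoDB document that has the field.
    public static class FieldMapper extends Mapper<Object, BSONObject, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(Object id, BSONObject doc, Context context)
                throws IOException, InterruptedException {
            Object value = doc.get("category"); // hypothetical field name
            if (value != null) {
                outKey.set(value.toString());
                context.write(outKey, ONE);
            }
        }
    }

    // Sums the counts emitted for each key.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Input still comes from MongoDB through the connector (placeholder URI).
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfsJob.class);
        job.setMapperClass(FieldMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setInputFormatClass(MongoInputFormat.class);
        // Output goes to HDFS instead of MongoOutputFormat.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-output")); // placeholder path

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If you would rather have human-readable results, Hadoop's `TextOutputFormat` can be swapped in for `SequenceFileOutputFormat` in the same way. The output directory must not already exist when the job starts.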

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow