Question

Is it possible to read MongoDB data through the Hadoop MongoDB connector plugin, process it with Hadoop MapReduce, but then skip the connector on the output side and leave the MapReduce results as-is in HDFS?


Solution

I think this previous answer on SO answers your question, with a minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS(MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass( SequenceFileOutputFormat.class );

You'll also need to set the HDFS output path where you want the data saved. See their WordCount example for a full code listing, but use the above as the output format instead of MongoOutputFormat.
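Putting those two changes together, a minimal job setup might look like the sketch below. It assumes the mongo-hadoop connector classes (`com.mongodb.hadoop.*`) are on the classpath; the MongoDB URI, the HDFS output path, and the `WordCountMapper`/`WordCountReducer` class names are placeholders, not part of the original answer:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfsJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Input comes from MongoDB, as in the connector's WordCount example
        // (placeholder URI)
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/db.collection");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfsJob.class);
        job.setMapperClass(WordCountMapper.class);    // placeholder mapper
        job.setReducerClass(WordCountReducer.class);  // placeholder reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Read via the connector...
        job.setInputFormatClass(MongoInputFormat.class);
        // ...but write a plain Hadoop SequenceFile to HDFS
        // instead of using MongoOutputFormat
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        SequenceFileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The results land in HDFS as a SequenceFile; if you want plain text instead, `TextOutputFormat` works the same way.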

Licensed under: CC-BY-SA with attribution