Hadoop with MongoDB plugin: reading data

https://stackoverflow.com/questions/9879517

26-05-2021
Question

I know that it is possible to read and write MongoDB data from Hadoop.

When this adapter reads data from a MongoDB collection, does it use the native MongoDB driver and go through a mongod instance, or does it read the collection's data directly?

Also, when Hadoop reads MongoDB data for processing in a MapReduce job, does that job lock the MongoDB collection?

In other words, when Hadoop reads data from MongoDB, does it save a copy for its own use, so that the MapReduce job works on data retrieved from MongoDB but stored internally in Hadoop, and does not interfere with the data in MongoDB?

Solution

The mongo-hadoop plugin does not cache or save any data within Hadoop.

Instead, each chunk is read into Hadoop as an individual input split, which parallelizes the Hadoop MapReduce job.
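To make the chunk-to-split idea concrete, here is a small, self-contained Python model of it. It is only an illustration and not the connector's actual code: `Chunk`, `InputSplit`, `splits_from_chunks`, `read_split` and the `user_id` shard key are hypothetical names, and the range query form is a simplification of how a connector might bound each read.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for one chunk: a half-open shard-key range."""
    min_key: Optional[int]  # None means unbounded below
    max_key: Optional[int]  # None means unbounded above


@dataclass(frozen=True)
class InputSplit:
    """Hypothetical stand-in for an input split: the query one map task runs."""
    query: Dict[str, Any]


def splits_from_chunks(chunks: List[Chunk], shard_key: str) -> List[InputSplit]:
    """Build one range query per chunk so each map task reads a disjoint slice."""
    splits = []
    for chunk in chunks:
        bounds: Dict[str, int] = {}
        if chunk.min_key is not None:
            bounds["$gte"] = chunk.min_key
        if chunk.max_key is not None:
            bounds["$lt"] = chunk.max_key
        splits.append(InputSplit(query={shard_key: bounds} if bounds else {}))
    return splits


def read_split(documents: List[Dict[str, Any]], split: InputSplit,
               shard_key: str) -> List[Dict[str, Any]]:
    """Simulate a map task fetching only the documents inside its split."""
    bounds = split.query.get(shard_key, {})
    low, high = bounds.get("$gte"), bounds.get("$lt")
    return [
        doc for doc in documents
        if (low is None or doc[shard_key] >= low)
        and (high is None or doc[shard_key] < high)
    ]


if __name__ == "__main__":
    # Three made-up chunks covering the whole key space.
    chunks = [Chunk(None, 10), Chunk(10, 20), Chunk(20, None)]
    collection = [{"user_id": i} for i in range(30)]
    for split in splits_from_chunks(chunks, "user_id"):
        print(split.query, len(read_split(collection, split, "user_id")))
```

In this model each split only describes which slice of the collection to fetch. Nothing is copied ahead of time, which matches the point above that no data is cached or saved within Hadoop.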

The only locking that occurs in MongoDB is a light read lock, held while data is being read from MongoDB.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow