MapReduce inefficient reducer
Question
Apart from every key emitted by the map function being identical, what would cause a MapReduce job to run with only a single reducer?
Solution
Possible causes:
- Your cluster still uses the default setting of 1 reducer (mapred.reduce.tasks = 1).
- Your code explicitly sets the number of reducers to 1.
- You are running in local mode (i.e. no cluster at all), in which case the configured number of reducers is ignored.
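To see why a reducer count of 1 funnels everything into one reducer no matter how varied the keys are, here is a minimal sketch. It reuses the formula from Hadoop's default HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks, but the class and method names are hypothetical and plain String keys stand in for Hadoop's Text keys (whose hash codes differ).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; it is not part of Hadoop.
final class ReducerPartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the reducer count.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of the given keys land on each reducer index.
    static Map<Integer, Integer> keysPerReducer(List<String> keys, int numReduceTasks) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int r = 0; r < numReduceTasks; r++) {
            counts.put(r, 0);
        }
        for (String key : keys) {
            counts.merge(partitionFor(key, numReduceTasks), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d");
        // With one reducer, anything modulo 1 is 0: prints {0=4}
        System.out.println(keysPerReducer(keys, 1));
        // With four reducers these keys spread out: prints {0=1, 1=1, 2=1, 3=1}
        System.out.println(keysPerReducer(keys, 4));
    }
}
```

In a real driver the count is usually raised with Job.setNumReduceTasks(int), or with -D mapred.reduce.tasks=N on the command line when the job is launched through ToolRunner.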
Quote from mapred-default.xml:
<property>
  <name>mapred.reduce.tasks</name>
  <value>1</value>
  <description>The default number of reduce tasks per job. Typically set to 99%
  of the cluster's reduce capacity, so that if a node fails the reduces can
  still be executed in a single wave.
  Ignored when mapred.job.tracker is "local".
  </description>
</property>
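The "99% of the cluster's reduce capacity" rule of thumb in the description can be worked through as simple arithmetic. The sketch below assumes reduce capacity means the number of nodes multiplied by the reduce slots per node; the helper name and the cluster sizes are hypothetical, chosen only for illustration.

```java
// Hypothetical helper; it is not part of Hadoop.
final class ReduceCapacitySketch {

    // Returns 99% of the total reduce slots, rounded down, but never fewer than 1.
    // Integer arithmetic avoids floating-point rounding surprises.
    static int suggestedReduceTasks(int nodes, int reduceSlotsPerNode) {
        int capacity = nodes * reduceSlotsPerNode;
        return Math.max(1, (capacity * 99) / 100);
    }

    public static void main(String[] args) {
        // Assumed cluster: 10 nodes with 4 reduce slots each = 40 slots; prints 39
        System.out.println(suggestedReduceTasks(10, 4));
        // A single node with a single slot still gets one reducer; prints 1
        System.out.println(suggestedReduceTasks(1, 1));
    }
}
```

Whatever value comes out would then replace the default of 1, either in the cluster configuration or per job.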
Licensed under: CC-BY-SA with attribution