toth was right: I had set mapred.tasktracker.map.tasks.maximum too high, so the memory requirement was absurd. Amazon's default values are generally appropriate here.
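To see why an oversized mapred.tasktracker.map.tasks.maximum can overwhelm a node, here is a rough back-of-the-envelope sketch. The ~200 slots per node is inferred from the question (about 5000 "running" tasks spread over 25 core nodes). The per-task heap of 200 MB, the 7680 MB of node RAM, and the 1024 MB reserved for the OS and daemons are assumptions chosen for illustration, not values from my cluster, and the function names are hypothetical.

```python
def map_slot_memory_mb(map_slots, child_heap_mb):
    """Worst-case heap demand (MB) if every map slot on a node is busy."""
    return map_slots * child_heap_mb


def slots_fit(map_slots, child_heap_mb, node_ram_mb, reserved_mb=1024):
    """True if all map slots fit in RAM after reserving memory for the
    OS and Hadoop daemons (reserved_mb is an assumed figure)."""
    return map_slot_memory_mb(map_slots, child_heap_mb) <= node_ram_mb - reserved_mb


def max_safe_slots(child_heap_mb, node_ram_mb, reserved_mb=1024):
    """Largest slot count whose combined heap fits in the usable RAM."""
    usable_mb = max(node_ram_mb - reserved_mb, 0)
    return usable_mb // child_heap_mb


if __name__ == "__main__":
    slots_per_node = 5000 // 25   # ~200 concurrent map tasks per core node
    heap_mb = 200                 # assumed per-task JVM heap
    node_ram_mb = 7680            # assumed instance memory

    print("demand (MB):", map_slot_memory_mb(slots_per_node, heap_mb))
    print("fits:", slots_fit(slots_per_node, heap_mb, node_ram_mb))
    print("max safe slots:", max_safe_slots(heap_mb, node_ram_mb))
```

Under these assumed numbers, the demand is several times the node's memory, which is consistent with the nodes appearing to hang. The real limits depend on the instance type and on the heap set in mapred.child.java.opts.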
Amazon Elastic MapReduce completes bootstrap actions on master node, but hangs on core nodes
21-06-2023
Question
I'm running an Amazon Elastic MapReduce (EMR) job on 1 master node and 25 core nodes. Bootstrap actions are completed on the master node, but they hang on core nodes. ~5000 (of 5200) tasks constituting a map step are then reported to be "running," while the remaining tasks are "pending." Because the core nodes are hanging, however, nothing is actually being run; I can tell because no intermediate output is being written. After ~30 minutes, all previously "running" tasks are stamped "killed_unclean" and shifted to "pending." A few minutes later, bootstrap actions are completed on the core nodes, but none of the tasks then shift from "pending" to "running."
This problem does not arise when I run my job with 2 core nodes rather than 25; tasks are finished as expected. What could be wrong, and how can I fix it?
Solution