Question

Do the worker nodes in a Hadoop cluster need Hadoop installed on each one?

What if I only need the computing power of some PCs? Can I use just MapReduce without installing HDFS on each node?


Solution

The term "worker node" covers both the DataNode and the TaskTracker daemons. In that sense, you need them on every machine that is going to run MR jobs.

But the main point is what you would do with MR alone. Running MR jobs on data stored in the local FS is not going to be of much use, because in that setup you can't harness the distributed data storage and the parallelism that Hadoop provides.

OTHER TIPS

To use the computing power of a node, you need to run a TaskTracker on that node. Hence, Hadoop must be installed on it.

If you don't need HDFS, you can run only the TaskTracker and not start the DataNode.
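
For reference, the configuration of a TaskTracker-only worker might look roughly like the fragment below. This is a minimal sketch that assumes Hadoop 1.x property names. The host name `jobtracker-host`, the port `9001`, and the use of a shared mount (for example NFS) visible at the same path on every node are assumptions for illustration, not something stated in the question.

```xml
<!-- conf/mapred-site.xml on each worker:
     tells the TaskTracker where the JobTracker is listening -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-host:9001</value> <!-- hypothetical host and port -->
  </property>
</configuration>

<!-- conf/core-site.xml on each worker:
     use the local (assumed shared) file system instead of HDFS -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```

With something like this in place, starting just the TaskTracker on a worker should be a matter of running `bin/hadoop-daemon.sh start tasktracker` rather than `bin/start-all.sh`, so no DataNode comes up. Keep in mind that without HDFS the job files, input and output have to live somewhere every node can reach, which is why the shared mount is assumed here.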

Licensed under: CC-BY-SA with attribution