Question

Do the worker nodes in a Hadoop cluster need Hadoop installed on each one?

If I only need the computing power of some PCs, can I use MapReduce alone, without installing HDFS on each node?

Solution

The term "worker node" covers both the DataNode and the TaskTracker daemons. So in that sense, you need them on each machine if you want to run MR jobs.

But the main question is what you would do with MR alone. Running MR jobs on data stored in the local file system is not going to be of much use, because in that situation you can't harness the distributed data storage and parallelism that Hadoop provides.

Other suggestions

To use the computing power of a node, you need to run a TaskTracker on that node. Hence, Hadoop must be installed on it.

If you don't need HDFS, you can run only the TaskTracker and leave the DataNode stopped.
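
As a rough sketch of what that could look like on a Hadoop 1.x (MRv1) worker, the function below starts only the TaskTracker through the stock `hadoop-daemon.sh` script. The install path `/opt/hadoop`, the `WORKER_DAEMONS` and `DRY_RUN` variables, and the function name are assumptions made for illustration. The node still needs a `mapred-site.xml` whose `mapred.job.tracker` points at your JobTracker.

```shell
#!/bin/sh
# Sketch for a Hadoop 1.x (MRv1) worker that contributes compute only.
# HADOOP_HOME is an assumed install location; adjust it to your layout.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"

# Daemons this node should run. "datanode" is deliberately left out,
# so the node stores no HDFS blocks.
WORKER_DAEMONS="tasktracker"

start_worker_daemons() {
    for daemon in $WORKER_DAEMONS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # DRY_RUN=1 prints the command instead of running it.
            echo "$HADOOP_HOME/bin/hadoop-daemon.sh start $daemon"
        else
            "$HADOOP_HOME/bin/hadoop-daemon.sh" start "$daemon"
        fi
    done
}
```

Calling `start_worker_daemons` with `DRY_RUN=1` set shows what would be started before you run it for real. Keep in mind that without HDFS the job input has to be reachable by every TaskTracker some other way, for example through a shared mount.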

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow