Question

Do the worker nodes in a Hadoop cluster need Hadoop installed on each one?

What if I only need the computing power of some PCs? Can I use only MapReduce without installing HDFS on each node?

Solution

When you say worker nodes, that includes both the DataNode and the TaskTracker. In that sense, you need them on each machine if you want to run MR jobs.

But the main question is what you would do with MR alone. Running MR jobs on data stored in the local FS is not going to be of much use, because in that situation you can't harness the distributed data storage and parallelism that Hadoop provides.

Other tips

To use the computing power of a node, you need to run a TaskTracker on that node. Hence, Hadoop must be installed.

If you don't need HDFS, you can run only the TaskTracker and leave the DataNode stopped.
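
As a rough illustration of the point above, here is a minimal Python sketch that picks which worker daemons a node would run and builds the matching start commands. The helper names `worker_daemons` and `start_commands` are hypothetical, and the sketch assumes a Hadoop 1.x install where `hadoop-daemon.sh` is on the PATH. It only builds the command strings; it does not run them.

```python
# Hypothetical helpers for illustration only; they are not part of Hadoop.

def worker_daemons(need_compute, need_hdfs):
    """Return the Hadoop 1.x worker daemons to start on one node."""
    daemons = []
    if need_hdfs:
        daemons.append("datanode")      # stores HDFS blocks on this node
    if need_compute:
        daemons.append("tasktracker")   # runs map and reduce tasks on this node
    return daemons


def start_commands(daemons, script="hadoop-daemon.sh"):
    """Build (but do not execute) one start command per daemon.

    Assumption: `script` is the Hadoop 1.x daemon launcher on the PATH.
    """
    return [f"{script} start {name}" for name in daemons]


if __name__ == "__main__":
    # A compute-only node: TaskTracker without a DataNode.
    for cmd in start_commands(worker_daemons(need_compute=True, need_hdfs=False)):
        print(cmd)  # prints "hadoop-daemon.sh start tasktracker"
```

Even on a compute-only node, the tasks still have to read their input from a location that every TaskTracker can reach, which is the concern raised in the accepted answer.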

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow