A Hadoop cluster is a combination of Master & Slave nodes, the Master nodes being NameNode, Secondary NameNode and Job Tracker
& the Slave nodes being DataNodes and TaskTrackers
. In any hadoop cluster, there is 1 instance of the Master nodes, i.e 1 NameNode, 1 Secondary NameNode and 1 Job Tracker
and multiple slave nodes (dataNodes and taskTrackers)
What is a Hadoop Cluster?
-
06-10-2022 - |
Question
I am confused because I see the word 'Cluster' being used in multiple scenarios. Which of these below is the proper use of 'Cluster'?
- All the NameNodes, Job Trackers and Data Nodes
- One set of NameNode, JobTracker, and DataNodes (there being many more such sets)
Edit:
An example of such confusing line is at this page:
DistCp (distributed copy) is a tool used for large inter/intra-cluster copying
If all the servers(nodes) together is a cluster, then where is the question of 'inter cluster copying'? If a subset of servers is a cluster, then what exactly is that subset?
Solution
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow