Question

I have a small cluster of six 64-bit machines installed with Cloudera Manager (CM), which works perfectly. I want to add a set of 32-bit machines as datanodes to this cluster. According to the CM documentation, it is not possible to install 32-bit machines with the manager. On the other hand, I am able to install CDH4 manually on these 32-bit machines. Is there any way I can connect all the machines together?


Solution

Yes, it is possible to connect 32-bit and 64-bit machines in the same cluster, even if the 64-bit machines were installed with Cloudera Manager.

First, download and install the Cloudera package for 32-bit machines on each 32-bit host: sudo yum --nogpgcheck -y localinstall cloudera-cdh-4-0.i386.rpm

Then install hadoop-hdfs-datanode: sudo yum -y install hadoop-hdfs-datanode

If you want to assign the /my-hdfs-dir directory to HDFS, make sure the hdfs user owns this directory on each 32-bit host: chown -R hdfs /my-hdfs-dir
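To double-check the ownership change, something like the following could be used. It is only a sketch: the function name not_owned_by is made up for this example, and it relies on nothing more than the standard find utility.

```shell
# List everything under a directory that is NOT owned by the given user.
# Prints nothing when ownership is already correct everywhere.
not_owned_by() {
    dir=$1
    user=$2
    find "$dir" ! -user "$user" -print
}

# Example, using the directory and user from the step above:
#   not_owned_by /my-hdfs-dir hdfs
```

An empty result suggests that chown -R reached every file and subdirectory.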

You need to add each 32-bit host to the list of allowed hosts, dfs_hosts_allow.txt. In my case this file is located in /var/run/cloudera-scm-agent/process/847-hdfs-NAMENODE. If you go to /var/run/cloudera-scm-agent/process/ you will find a list of process directories; modify the file in the most recent NAMENODE one. Then refresh the nodes so that the namenode takes the new hosts into account: sudo -u hdfs hdfs dfsadmin -refreshNodes
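A rough sketch of how this step could be scripted on the namenode is shown below. The helper names latest_namenode_dir and allow_host are invented for this example, and it assumes that the directory with the highest numeric prefix (such as 847) is the most recent one, which I have not verified for every CM version.

```shell
# Print the NAMENODE process directory with the highest numeric prefix.
# Assumption: the highest prefix (e.g. 847 in 847-hdfs-NAMENODE) is the most recent.
latest_namenode_dir() {
    base=$1
    name=$(ls -1 "$base" | grep -- '-hdfs-NAMENODE$' | sort -n | tail -n 1)
    [ -n "$name" ] && printf '%s/%s\n' "$base" "$name"
}

# Append a host to an allow-list file unless it is already listed.
allow_host() {
    file=$1
    host=$2
    # Nothing to do if the exact host name is already on its own line.
    if grep -qxF -- "$host" "$file" 2>/dev/null; then return 0; fi
    # Start a new line first if the file does not end with one.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$host" >> "$file"
}

# Example, run as a user allowed to write the file
# (node32-1.example.com is a hypothetical host name):
#   dir=$(latest_namenode_dir /var/run/cloudera-scm-agent/process)
#   allow_host "$dir/dfs_hosts_allow.txt" node32-1.example.com
#   sudo -u hdfs hdfs dfsadmin -refreshNodes
```

Calling allow_host twice with the same host leaves a single entry, so the script can be re-run safely.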

You also need to configure each 32-bit host. The simplest way is to copy the core-site.xml and hdfs-site.xml files from one of the existing 64-bit slaves. You will find these files in /var/run/cloudera-scm-agent/process/xxx-hdfs-DATANODE on the slave. You can comment out all properties containing the name of the slave you are copying the files from (these do not seem to be necessary). Once the files are modified, copy them to /etc/hadoop/conf on every 32-bit host.
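To find the properties that mention the source slave, a small helper along these lines may be handy. It is a sketch only: host_refs is a made-up name, and it simply lists the matching lines so that the enclosing property blocks can be commented out by hand.

```shell
# Print every line of the two config files that mentions the given host name,
# prefixed with file name and line number.
host_refs() {
    conf_dir=$1
    host=$2
    grep -nF -- "$host" "$conf_dir/core-site.xml" "$conf_dir/hdfs-site.xml"
}

# Example, run inside a local copy of the slave's config directory
# (slave64-1 is a hypothetical host name):
#   host_refs . slave64-1
```

The match is a plain substring search, so a short host name may also match longer names that contain it.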

You can now start the HDFS datanode service on the 32-bit hosts: sudo service hadoop-hdfs-datanode start

You can check that the new datanodes have joined the cluster by browsing to master_ip:50070. Unfortunately, I do not think there is a way to see these new machines in the Cloudera Manager web UI. If someone knows a solution, it is very welcome.
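As a command-line alternative to the web page, the output of hdfs dfsadmin -report can be searched for the new host. The helper below is a minimal sketch: report_mentions is an invented name, and it only checks that the host appears somewhere in the report, not that the datanode is healthy.

```shell
# Succeed if the text on standard input mentions the given host.
# Intended to be fed the output of "hdfs dfsadmin -report".
report_mentions() {
    grep -qF -- "$1"
}

# Example (node32-1 is a hypothetical host name):
#   sudo -u hdfs hdfs dfsadmin -report | report_mentions node32-1 && echo "listed"
```

Because this is a substring match, it is still worth reading the host's section of the report to confirm its state.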

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow