Question

I am using Cloudera Manager Free Edition on my "cluster", with all services running on a single machine.

My machine acts as the datanode, the namenode, and the secondary namenode.

My HDFS settings related to replication are:

dfs.replication                                   - 1
dfs.replication.min, dfs.namenode.replication.min - 1
dfs.replication.max                               - 1   

Still, I get under-replicated blocks and hence Bad Health.

The namenode log says:

Requested replication 3 exceeds maximum 1
java.io.IOException: file /tmp/.cloudera_health_monitoring_canary_files/.canary_file_2013_10_21-15_33_53 on client 111.222.333.444
Requested replication 3 exceeds maximum 1
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.verifyReplication(BlockManager.java:858)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1848)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1771)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1747)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:439)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:207)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44942)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

I have altered the values, saved, deployed the client configuration, and restarted too. It's still the same.
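For what it's worth, here is a quick check of what replication value the deployed client configuration actually resolves to (this assumes the hdfs getconf tool from the client packages is on the PATH):

hdfs getconf -confKey dfs.replication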

What property do I need to set to make CM read the replication factor as 1 instead of 3?

Solution 2

It's a client setting: the client wants to replicate the file 3 times, and the canary test acts as a client. It looks like you have to tune the HDFS canary test settings. Alternatively, you could use Cloudera Manager to mark the replication factor property as final, which would forbid clients from overriding it.
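If you want to confirm that the replication factor really is chosen by the client, a minimal sketch (the paths are illustrative) is to write a test file with an explicit client-side override and then print the replication factor HDFS recorded for it:

hadoop fs -D dfs.replication=1 -put /etc/hosts /tmp/repl_test
hadoop fs -stat %r /tmp/repl_test

The put should succeed even with dfs.replication.max set to 1 on the namenode, because the client asked for only one replica, and the stat should print 1.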

OTHER TIPS

Change the replication factor directly in a shell

hadoop fs -setrep -R 1 /

If you have permission problems, what worked for me was to change the replication factor as the user of each file. I had to change the replication factor for oozie files as follows:

sudo -u oozie bash
hadoop fs -setrep -R 1 /

Repeat for each user for which the permissions failed.
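A minimal sketch of that repetition, assuming a hypothetical list of service users (adjust it to whichever users actually own files on your cluster); each pass fixes the files that user can modify, and permission errors on everything else are suppressed:

for u in hdfs oozie hive hue mapred; do
  sudo -u "$u" hadoop fs -setrep -R 1 / 2>/dev/null
done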

I faced this issue too. In my case it was due to missing blocks. To confirm whether that is the case, go to the namenode web UI at http://hostname:50070 and look at the block report. Try to delete or re-upload the files whose blocks are missing. This should resolve your issue; that is how I resolved mine.
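If you prefer the command line, a hedged sketch: hdfs fsck can list the files that have missing or corrupt blocks and, if those files are expendable, delete them outright:

hdfs fsck / -list-corruptfileblocks
hdfs fsck / -delete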

  1. Log in as the HDFS user: # su - hdfs

  2. Execute this set of commands to fix under-replicated blocks in HDFS manually (adjust the target replication factor, 3 here, to whatever suits your cluster):

    # hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
    # for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :"; hadoop fs -setrep 3 $hdfsfile; done
    
$ hadoop fs -setrep -R 1 /

or

update this property in your hdfs-site.xml file:

dfs.replication=1
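For reference, a minimal hdfs-site.xml entry might look like the snippet below; the <final> element is optional and, as the accepted answer notes, prevents clients (including the canary test) from overriding the value:

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <final>true</final>
</property>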

Well, it's not recommended to run both the secondary namenode and the namenode on the same node. Put them on separate machines for better results.

Coming to your question: I assume you are testing on that same single machine. Cloudera still assumes you have three replicas, which is why this problem showed up. To form a proper separate cluster, you should have a minimum of 4 systems.

First, check whether your HDFS configuration in hdfs-site.xml contains this property:

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

I assume your cluster has only 2 or 3 systems, so the remaining replicas cannot be placed properly, which is why this problem showed up.

You can resolve this problem. Just open a terminal and enter this command:

$ hadoop fs -setrep -R 1 /

This overwrites the replication factor on the existing files and resolves the problem. Otherwise, add a few more systems (three or more) to the existing cluster, i.e. commission new nodes, and your problem will surely be resolved.
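Whichever route you take, you can verify the result with fsck; the under-replicated block count in its summary should drop to zero once replication settles:

hdfs fsck / | grep -i 'under-replicated'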

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow