This was discussed in more detail in a thread on the Cloudera cdh-user mailing list:
https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/GeT1RTbRVcw
In summary:
Regular DFS commands like dfs -cp will not work between CDH3 and CDH4, because the two use different protocol versions and are incompatible with one another over regular RPC calls.
Distcp can be used to copy HDFS data across clusters, even from CDH3 to CDH4, but there are a few prerequisites: you need to run the distcp command on the CDH4 cluster, and the CDH4 cluster needs to have MapReduce (mapred) deployed and available. The CDH3 cluster doesn't necessarily need mapred.
When running the distcp command, do not use hdfs for the source path. Use hftp for the source path and hdfs for the destination path (hftp is READ-ONLY, and you need write access to the destination path), so the command looks like:
hadoop distcp hftp://hadoop-namenode.cluster1/hbase hdfs://hadoop-namenode.cluster2/hbase
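
If you copy more than one path, a small wrapper can keep the schemes straight. Because hftp is read-only, the source is read over hftp:// and the destination has to be a writable hdfs:// URI. The sketch below is only an illustration: build_distcp_cmd is a hypothetical helper name, the hostnames are the placeholders used above, and it prints the command instead of running it so you can check it before launching it on the CDH4 cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): prints the distcp command
# instead of executing it, so it can be reviewed first.
build_distcp_cmd() {
    src_nn="$1"   # CDH3 NameNode host, read over hftp (read-only)
    dst_nn="$2"   # CDH4 NameNode host, written over hdfs
    path="$3"     # absolute path to copy, e.g. /hbase
    printf 'hadoop distcp hftp://%s%s hdfs://%s%s\n' \
        "$src_nn" "$path" "$dst_nn" "$path"
}

# Placeholder hostnames taken from the example above.
build_distcp_cmd hadoop-namenode.cluster1 hadoop-namenode.cluster2 /hbase
```

Once the printed line looks right, run it from a node in the CDH4 cluster.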