Question

I have several questions, or rather points of confusion, regarding CDH4. I am posting here because I could not find any concrete information elsewhere.

Is CDH4 meant to promote YARN? I tried setting up MapReduce v1 (MR1) from the CDH 4.3.0 tarball. I eventually got it working, but the process was roundabout and painful, whereas setting up YARN is straightforward.

Is anyone using YARN in production at all? Apache clearly says that YARN is still in alpha and not meant for production. In that case, why is Cloudera making CDH4 YARN-centric? Does Cloudera support YARN in production?

Apologies if the questions are inappropriate.

This is what the extracted tarball looks like:

CDH4.3.0 tarball extracts

I followed a couple of links to do the configuration, but I am not happy with the way it had to be done:

CDH 4.3.0 tarball and MR1


Solution

No, CDH4 is not meant mainly for YARN. CDH5, on the other hand, will be.

I'm not sure how you went about setting up your CDH cluster, but it's rather easy to add the MapReduce v1 (MRv1) service, as opposed to YARN, using Cloudera Manager.

Very few companies use YARN in production, Yahoo being the most notable.

CDH4 is not YARN-centric. Cloudera includes YARN so that people have access to the most recent Hadoop bits, but Cloudera's website makes it very clear that they do not recommend YARN for production.

One of the big things that CDH4 brought to the table last year was HDFSv2, and they made MRv1 compatible with it.

To install CDH4 with MRv1, see here.
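
If you are unsure which framework a given client configuration actually selects, one rough check is to look at mapred-site.xml. The sketch below is only an illustration, not an official Cloudera tool. It assumes the usual Hadoop property names: `mapreduce.framework.name` set to `yarn` for MRv2, and `mapred.job.tracker` holding the JobTracker address for MRv1. The sample XML, the host name, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapred-site.xml contents; the host name is invented.
SAMPLE_MR1 = """<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>"""

SAMPLE_YARN = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def detect_framework(props):
    """Guess whether the properties point at YARN, MRv1, or the local runner."""
    if props.get("mapreduce.framework.name") == "yarn":
        return "YARN (MRv2)"
    # Assumption: an unset or "local" JobTracker address means no MRv1 cluster.
    tracker = props.get("mapred.job.tracker", "local")
    if tracker != "local":
        return "MRv1 (JobTracker at %s)" % tracker
    return "local"


print(detect_framework(read_properties(SAMPLE_MR1)))
print(detect_framework(read_properties(SAMPLE_YARN)))
```

To run it against a real cluster configuration, read the contents of your own mapred-site.xml and pass the text to `read_properties`. Where that file lives depends on how you installed CDH.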

Licensed under: CC-BY-SA with attribution