Question

I am quite new to Amazon services and have started reading about EMR. I am more or less familiar with OpenStack. Could someone tell me briefly what plays the role of OpenStack's compute, controller, and Cinder storage components in the Amazon cloud?

For example, Cinder is the storage for OpenStack, and likewise S3 is the storage in the Amazon cloud.

What are the other two, compute and controller, in the Amazon cloud?

Also, can someone please explain in simple words the relation between EMR and EC2, or are they entirely different?

Even in EMR we use EC2 instances, so why are people comparing Hadoop on EC2 with Elastic MapReduce, as in the following link?

Hadoop on EC2 vs Elastic Map Reduce

Thanks a ton in advance :)


Solution

OpenStack is open source software that you can set up on your own infrastructure to get a cloud with managed services similar to Amazon's.

Amazon is its own independent service with a proprietary implementation, and it basically sells that service.

So OpenStack has several components that map roughly one-to-one to AWS services:

Controller -> Amazon Console
Cinder -> EBS
Storage -> S3
Compute -> EC2

EMR (Elastic MapReduce) is just another Amazon service, one that lets you run Hadoop jobs. EMR runs on top of EC2, so when you create an EMR cluster it uses EC2 instances as its underlying service.
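To make the EMR-on-EC2 relationship concrete, here is a minimal Python sketch of the kind of request you could pass to the `run_job_flow` call of boto3's EMR client. The cluster name, release label, instance type, node counts, and the helper names `build_emr_request` and `ec2_instance_count` are all assumptions picked for illustration. The role names are the usual EMR defaults and may not exist in your account. The point is only that an EMR cluster is described in terms of EC2 instance types and counts.

```python
def build_emr_request(name="example-cluster", instance_type="m5.xlarge", core_nodes=2):
    """Build keyword arguments for the EMR client's run_job_flow call.

    All values here are illustrative assumptions, not recommendations.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed EMR release
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            # Each instance group is a set of ordinary EC2 instances.
            "InstanceGroups": [
                {
                    "Name": "Master",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": instance_type,
                    "InstanceCount": core_nodes,
                },
            ],
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default role names, assumed to exist
        "ServiceRole": "EMR_DefaultRole",
    }


def ec2_instance_count(request):
    """Count how many EC2 instances the cluster request would launch."""
    groups = request["Instances"]["InstanceGroups"]
    return sum(group["InstanceCount"] for group in groups)


if __name__ == "__main__":
    request = build_emr_request()
    print(ec2_instance_count(request), "EC2 instances would be launched")
    # To actually create the cluster (needs boto3, AWS credentials, and costs money):
    # import boto3
    # emr = boto3.client("emr", region_name="us-east-1")
    # response = emr.run_job_flow(**request)
    # print(response["JobFlowId"])
```

The instances that EMR starts this way show up in your EC2 console like any others. EMR just handles provisioning them and installing Hadoop on them.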

You can also run Hadoop on EC2 instances independently of EMR. The downside is that you have to manage the Hadoop installation and configuration yourself (Cloudera Manager is pretty helpful for this). The advantage is that you can tweak the Hadoop stack as much as you want.

Hope this helps.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow