Question

I have been reading and hearing about cloud computing and MapReduce techniques lately. I would like to experiment with some algorithms to get practical experience in the field and see what is currently possible.

Here is what I want to do: I would like to use a public cloud platform (e.g. Google App Engine, Google MapReduce, Amazon EC2, Amazon MapReduce) that comes with built-in MapReduce functionality or, if it lacks built-in support, add a MapReduce Java library (e.g. Hadoop, Hive), and then implement and deploy some algorithms.

Does anyone have experience in this field and can suggest a good starting point? Or can you name some combinations that have worked well in practice?

Thanks in advance!

Solution

Amazon EC2 has some pre-bundled Hadoop AMIs. See "Running Hadoop on Amazon EC2" for a tutorial.

In particular, the Cloudera distribution comes to mind - it comes with Pig and Hive as well.

OTHER TIPS

Apache Hadoop is a major open-source distributed computing framework written in Java, and it includes a MapReduce subproject modeled on the original Google MapReduce.
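
To get a feel for the programming model before deploying anything, here is a minimal sketch of the map, shuffle, and reduce phases in plain Java, using word count as the example. It deliberately does not use the Hadoop API: the class name WordCountSketch and its methods are made up for illustration, and everything runs in a single JVM (Java 9 or later assumed), so it only shows the shape of the computation, not how Hadoop distributes it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process illustration of the MapReduce model.
// This is NOT the Hadoop API; a real job would use Hadoop's own classes.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key.
    // A framework does this across machines; here it is one sorted map.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The fox");
        System.out.println(wordCount(lines));
    }
}
```

Once the split into map and reduce functions feels natural, porting the same logic to a real Hadoop job is mostly a matter of moving each phase into the framework's mapper and reducer classes and letting it handle the shuffle.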

Licensed under: CC-BY-SA with attribution