How to have an Amazon EC2-like environment on my system?
28-10-2019
Question
I have a Hadoop project that someone else coded (link), and I have the source. I want to run it on my own cluster (basically 3 Ubuntu machines), but the project in question runs on an EC2 platform (with the Cloudera distribution).
So, what should I install on my systems so that they have the software needed to run such a project?
So far I have thought of Cloudera Manager and Oracle Java.
Solution
If the project runs on the Cloudera distribution (not on EMR), you can install Cloudera on your machines and it should be fine. The only corner case I would expect to be problematic is if S3 was used as the file system.
If the project does indeed work against S3, you have two options:
a) Replace S3 with HDFS and update all file names / paths (if they are hardcoded); it should then work fine.
b) Install OpenStack's Swift, which is an open source alternative to S3, and then try to run Hadoop over it. Disclosure: I am involved in a project to run Hadoop over Swift. https://github.com/Dazo-org/swift
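For option (a), here is a minimal sketch of how hardcoded S3 URIs could be mapped to HDFS paths. It is only an illustration, not something taken from the project: the function name `s3_to_hdfs`, the NameNode address `namenode.example:8020`, and the `/data` base directory are all hypothetical placeholders you would replace with your own cluster's values.

```python
from urllib.parse import urlsplit

# URI schemes commonly used by Hadoop for S3-backed file systems.
S3_SCHEMES = {"s3", "s3n", "s3a"}


def s3_to_hdfs(uri, namenode="namenode.example:8020", base_dir="/data"):
    """Map an S3-style URI to an HDFS URI; leave other URIs untouched.

    namenode and base_dir are hypothetical defaults, not values from the
    original project. The bucket name becomes a directory under base_dir.
    """
    parts = urlsplit(uri)
    if parts.scheme not in S3_SCHEMES:
        return uri  # not an S3 path, nothing to rewrite

    bucket = parts.netloc
    key = parts.path.lstrip("/")

    segments = [base_dir.rstrip("/"), bucket]
    if key:
        segments.append(key)
    return "hdfs://" + namenode + "/".join(segments)


if __name__ == "__main__":
    for original in ("s3n://my-bucket/input/part-00000", "/local/path.txt"):
        print(original, "->", s3_to_hdfs(original))
```

You would still need to copy the data itself into HDFS under the matching directories; the sketch only covers rewriting the path strings.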