Question

I have learned to create job flows through the command line interface, and I am using the Amazon Elastic MapReduce CLI to create and run them. My question is how to copy my source files from an S3 bucket onto the master node using bootstrap actions when the job flow is created. I am running my job flows with Pig in interactive mode.

I read the documentation for bootstrap actions, but it was not clear to me.

Can anyone tell me how to copy my files from an S3 bucket to the master node using bootstrap actions? Thanks in advance.

Solution

Bootstrap actions are just standard Unix scripts. Make sure the shebang points to an interpreter that exists on the machines in your cluster and you're good to go.

When you say source files, do you mean your Pig scripts? These can be run directly off S3. If you are talking about data, you should read it directly off S3 unless you have a use case for copying it to the cluster first (like scanning the same data multiple times).
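
If you do need the files on the cluster, a bootstrap action along these lines might do the copy. This is only a sketch: the bucket name, paths, and script name are hypothetical, and it assumes the `hadoop` command is on the PATH when the bootstrap action runs and that your cluster can read `s3://` URIs through `hadoop fs`. Bootstrap actions run on every node, so the files end up on the master as well as on the other nodes.

```shell
#!/bin/bash
# Sketch of a bootstrap action that copies files from S3 to local disk.
# Upload it to S3 and reference it when creating the job flow, for example
# (hypothetical bucket and paths):
#   elastic-mapreduce --create --alive --pig-interactive \
#     --bootstrap-action s3://example-bucket/bootstrap/copy-files.sh \
#     --args "s3://example-bucket/input/,/home/hadoop/input"
set -e

# Copy an S3 path (first argument) into a local directory (second argument).
copy_from_s3() {
  local src="$1"
  local dest="$2"
  mkdir -p "$dest"                        # make sure the target directory exists
  hadoop fs -copyToLocal "$src" "$dest"   # assumes hadoop is on the PATH
}

# Only copy when both the source and the destination were passed in.
if [ "$#" -eq 2 ]; then
  copy_from_s3 "$1" "$2"
fi
```

Once the job flow is up, you can SSH to the master node and the files should be under the destination directory you passed in.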

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow