How can I chain jobs in Hadoop while being able to read the original input
Question
I want to chain 3 rounds of MapReduce and at the third one to be able to read the original input as well as the output of the second job. Is this at all possible?
Solution
You could set up the last job to use two mappers, one of which takes the original file as its input. This assumes you need to reduce both inputs (the input of the first job and the output of the second job) on some common key. The MultipleInputs class supports exactly this pattern: it lets you attach a different mapper (and, if needed, a different input format) to each input path of a single job, with all mapper outputs flowing into the same shuffle and reduce phase.
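A minimal driver sketch for the third job, assuming hypothetical mapper classes `OriginalInputMapper` and `SecondJobOutputMapper` (and a `JoinReducer`) that emit the same key type so one reducer can join the two inputs; the paths are placeholders:

```java
// Driver for the third round: joins the original input with job 2's output.
// OriginalInputMapper, SecondJobOutputMapper, JoinReducer, and all paths
// are hypothetical names standing in for your own classes and HDFS layout.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ThirdJobDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "third-round-join");
        job.setJarByClass(ThirdJobDriver.class);

        // One mapper reads the untouched original input...
        MultipleInputs.addInputPath(job, new Path("/data/original-input"),
                TextInputFormat.class, OriginalInputMapper.class);
        // ...while a second mapper reads the output of the second job.
        MultipleInputs.addInputPath(job, new Path("/data/job2-output"),
                TextInputFormat.class, SecondJobOutputMapper.class);

        job.setReducerClass(JoinReducer.class); // joins records on the common key
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path("/data/job3-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Both mappers must emit the same key/value types, since their outputs are merged into a single shuffle; the reducer then sees values from both sources grouped under each common key (a reduce-side join).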
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow