Question

I'm currently developing a set of MapReduce tasks that have to be run in a particular order. I'm looking to use Oozie to manage the dependencies and running of this workflow. There's one key feature that I need, though, and I can't find any documentation that suggests it is possible.

Basically, I am looking for a way to set up an action that checks, before executing, whether its output file is newer than its input file and the associated map-reduce code. If so, it would skip executing the action. This way, I could make a change to a script and have only that stage of the workflow (and any that depend on its output) run.

Does anyone know how I'd go about doing this?

Solution

How about using a shell action in Oozie? It can run a shell script that checks whether the content of the defined file has changed. On success of this action, go to the map-reduce action and continue your job; otherwise, go to the fail case and kill your job.
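As a rough sketch of such a check, here is a small Python script that a shell action could invoke. It compares modification times rather than file contents, which is closer to what the question asks for. The script name `check_stale.py`, the function names, and the `rerun` key are made up for illustration, and it assumes the `hdfs` command is on the PATH of the node running the action. Treating an unreadable dependency as a reason to rerun is also just one possible choice.

```python
import subprocess
import sys
from typing import Iterable, Optional


def hdfs_mtime(path: str) -> Optional[int]:
    """Return the HDFS modification time of path in milliseconds since
    the epoch, or None if the path cannot be read (e.g. it is missing)."""
    result = subprocess.run(
        ["hdfs", "dfs", "-stat", "%Y", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return int(result.stdout.strip())


def needs_rerun(output_mtime: Optional[int],
                dependency_mtimes: Iterable[Optional[int]]) -> bool:
    """Decide whether a stage has to run again.

    The stage reruns if its output is missing, if any dependency could
    not be read, or if any dependency is newer than the output.
    """
    if output_mtime is None:
        return True
    return any(m is None or m > output_mtime for m in dependency_mtimes)


if __name__ == "__main__":
    # Hypothetical usage: check_stale.py OUTPUT DEPENDENCY [DEPENDENCY ...]
    if len(sys.argv) < 3:
        sys.exit("usage: check_stale.py OUTPUT DEPENDENCY [DEPENDENCY ...]")
    output, deps = sys.argv[1], sys.argv[2:]
    rerun = needs_rerun(hdfs_mtime(output), [hdfs_mtime(d) for d in deps])
    # Print one key=value line so the workflow can read the result.
    print("rerun=" + ("true" if rerun else "false"))
```

If the shell action declares `<capture-output/>`, the printed `rerun=...` line should be readable from a decision node through `wf:actionData`, so the workflow can skip the map-reduce action instead of killing the job. I have not tested this wiring, so take it as a starting point.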

Hope this idea helps, if this is what you are looking for.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow