Question

I'm currently developing a set of map reduce tasks that have to be run in a particular order. I'm looking to use Oozie to manage the dependencies and running of this workflow. There's one key feature that I need, though, and I can't find any documentation that suggests that it is possible.

Basically, I am looking for a way to set up an action that, before executing, checks whether its output file is newer than both its input file and the associated map-reduce code. If so, it would skip executing the action. This way, I could make a change to a script and have only that stage of the workflow (and any that depend on its output) run.

Does anyone know how I'd go about doing this?


Solution

How about using a shell action in Oozie? It can run a shell script that checks whether the content of the file in question has changed. If the action succeeds, go to the map-reduce action and continue your job; otherwise, go to the fail case and kill your job.
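As a rough illustration of the check such a script could perform, here is a minimal sketch in Python rather than shell. It compares modification times instead of file contents, which is closer to what the question asks for. The names `needs_rerun` and `exit_code` are hypothetical, and the sketch assumes the files are visible on the local filesystem of the node running the action; for files on HDFS the timestamps would have to be fetched another way.

```python
import os


def needs_rerun(output_path, dependency_paths):
    """Return True if the output is missing or older than any dependency.

    dependency_paths would hold the input file plus the map-reduce
    code or scripts that produce the output (an assumption of this sketch).
    """
    if not os.path.exists(output_path):
        return True
    output_mtime = os.path.getmtime(output_path)
    # Stale as soon as one dependency was modified after the output.
    return any(os.path.getmtime(dep) > output_mtime for dep in dependency_paths)


def exit_code(output_path, dependency_paths):
    """Map the check to a process exit status: 0 = run the step, 1 = up to date."""
    return 0 if needs_rerun(output_path, dependency_paths) else 1
```

A wrapper script could pass the result of `exit_code(...)` to `sys.exit`, so that the action's success and failure transitions route the workflow as described above. Note that treating "already up to date" as a failure would kill the job rather than skip the step, so to get true skipping the failure transition would need to point at the next stage instead.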

Hope this idea helps, if this is what you are looking for.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow