Submitting the same coordinator job multiple times in Oozie

https://stackoverflow.com/questions/14634285

06-03-2022
Question

I have a coordinator job in Oozie. It calls the workflow with a java action node.
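For readers unfamiliar with the setup, a workflow with a single java action typically looks something like the sketch below. This is only an illustration, not the actual workflow from this question: the application name, the class name com.example.Main, and the ${jobTracker}, ${nameNode}, and ${arg1} properties are all assumed placeholders.

```xml
<!-- Hypothetical workflow.xml: one java action that forwards arg1 to the Main class -->
<workflow-app name="java-main-wf" xmlns="uri:oozie:workflow:0.1">
  <start to="java-node"/>
  <action name="java-node">
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Assumed class name; replace with the real Main class -->
      <main-class>com.example.Main</main-class>
      <!-- arg1 is supplied at submission time, e.g. from job.properties -->
      <arg>${arg1}</arg>
    </java>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Java action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```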

If I submit this job only once, it works perfectly. However, if I submit it twice with the same start and end time but a different arg1 for the Main class, both job instances hang in the "RUNNING" state and the logs look like this:

>>> Invoking Main class now >>>

Heart beat
Heart beat
Heart beat
Heart beat
...

If I kill one of the jobs, then the other one starts running again.

The documentation states that it is possible to submit multiple instances of the same coordinator job with different parameters: http://archive.cloudera.com/cdh/3/oozie/CoordinatorFunctionalSpec.html#a6.3._Synchronous_Coordinator_Application_Definition

"concurrency: The maximum number of actions for this job that can be running at the same time. This value allows to materialize and submit multiple instances of the coordinator app, and allows operations to catchup on delayed processing. The default value is 1."
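For reference, concurrency is set inside the controls element of the coordinator definition. The fragment below is a minimal sketch under assumptions, not the application from this question: the app name, the daily frequency, and the ${startTime}, ${endTime}, ${workflowAppPath}, and ${arg1} properties are placeholders. As the quoted text says, the value limits how many actions of one coordinator job run at the same time; it does not govern separately submitted jobs.

```xml
<!-- Hypothetical coordinator.xml: allows up to 2 actions of this job to run concurrently -->
<coordinator-app name="my-coord" frequency="${coord:days(1)}"
                 start="${startTime}" end="${endTime}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.1">
  <controls>
    <!-- Default is 1 according to the coordinator spec -->
    <concurrency>2</concurrency>
  </controls>
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <!-- Passes the per-submission argument through to the workflow -->
        <property>
          <name>arg1</name>
          <value>${arg1}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```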

So what am I doing wrong? I even saw two instances of the workflow action from the same job in the "RUNNING" state, and they ran fine once the other job was killed.

Solution

OK, I found the issue. It was related to HBase concurrency and too few task slots in the cluster. Raising the maximum number of map task slots per TaskTracker in the mapred-site.xml file fixes the issue:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>50</value>
</property>

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow