Question

Yesterday I kicked off an oozie workflow. It started two jobs that stalled all day. I killed them this morning, having made a change that I now want to test. After killing the two jobs it's like the workflow became unstuck and is now proceeding. I would like to kill the workflow so it doesn't keep starting new jobs to replace the ones I kill. How can I do that in the oozie command line?


Solution

You can view your running jobs with:

oozie jobs

or if it's a coordinator, not a workflow:

oozie jobs -jobtype coordinator

And get the Job ID from there, then do:

oozie job -kill [id]
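If the job list is long, picking the ID out by eye gets tedious. The following is a minimal sketch, not part of the original answer, of a helper that pulls job IDs out of the first column of the `oozie jobs` listing. The name `extract_job_ids` is hypothetical, and the ID pattern is an assumption inferred from the IDs quoted elsewhere on this page.

```shell
#!/usr/bin/env bash
# Print the job IDs found in the first column of `oozie jobs` output (read from stdin).
# Assumed ID shape: <sequence>-<timestamp>-<system id>-<W|C|B>, as in the IDs quoted
# on this page. Header and separator lines do not match and are skipped.
extract_job_ids() {
  awk '$1 ~ /^[0-9]+-[0-9]+-.+-[WCB]$/ { print $1 }'
}

# Hypothetical usage (needs a reachable Oozie server):
#   oozie jobs -filter status=RUNNING | extract_job_ids
#   oozie job -kill <one of the printed IDs>
```

Filtering on status=RUNNING narrows the list to the jobs that are still launching actions, which is usually the one you want to kill.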

Here's the command line tool reference page: http://incubator.apache.org/oozie/docs/3.1.3/docs/DG_CommandLineTool.html

OTHER TIPS

Oozie commands
--------------
Note: Replace the Oozie server host and port with the values for your cluster.
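As an alternative to repeating the server URL in every command, the Oozie CLI reads the OOZIE_URL environment variable as the default for the -oozie option. A small sketch, assuming the same localhost URL used in the examples in this answer:

```shell
# Assumption: the server from the examples in this answer; substitute your own host and port.
export OOZIE_URL="http://localhost:11000/oozie"

# With OOZIE_URL set, the -oozie option can be dropped, e.g.:
#   oozie job -info 0000001-130712212133144-oozie-oozi-W
```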
 
1) Submit job:
$ oozie job -oozie http://localhost:11000/oozie -config oozieProject/workflowHdfsAndEmailActions/job.properties -submit
job: 0000001-130712212133144-oozie-oozi-W
 
2) Run job:
$ oozie job -oozie http://localhost:11000/oozie -start 0000001-130712212133144-oozie-oozi-W
 
3) Check the status:
$ oozie job -oozie http://localhost:11000/oozie -info 0000001-130712212133144-oozie-oozi-W
 
4) Suspend workflow:
$ oozie job -oozie http://localhost:11000/oozie -suspend 0000001-130712212133144-oozie-oozi-W
 
5) Resume workflow:
$ oozie job -oozie http://localhost:11000/oozie -resume 0000001-130712212133144-oozie-oozi-W
 
6) Re-run workflow:
$ oozie job -oozie http://localhost:11000/oozie -config oozieProject/workflowHdfsAndEmailActions/job.properties -rerun 0000001-130712212133144-oozie-oozi-W
 
7) Should you need to kill the job:
$ oozie job -oozie http://localhost:11000/oozie -kill 0000001-130712212133144-oozie-oozi-W
 
8) View the job log:
$ oozie job -oozie http://localhost:11000/oozie -log 0000001-130712212133144-oozie-oozi-W
 
Logs are available at:
/var/log/oozie on the Oozie server.
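For the original question (kill the workflow, then make sure it really stopped), steps 7 and 3 can be combined. This is a rough sketch rather than a tested recipe: the function names are hypothetical, and the parsing assumes that the -info output contains a line of the form "Status : KILLED", which may differ between Oozie versions.

```shell
#!/usr/bin/env bash
# Read `oozie job -info <id>` output on stdin and print the value of the first
# line that starts with "Status" followed by a colon. The line layout is an
# assumption; adjust the pattern if your Oozie version prints it differently.
job_status() {
  awk -F':' '/^Status[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Kill a workflow, then print the status the server reports afterwards.
# Relies on OOZIE_URL being set (or add -oozie <url> to both calls).
kill_and_confirm() {
  local job_id="$1"
  oozie job -kill "$job_id" || return 1
  oozie job -info "$job_id" | job_status
}

# Hypothetical usage:
#   kill_and_confirm 0000001-130712212133144-oozie-oozi-W
```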

In addition to the post above listing the Oozie commands: sometimes we are not authorized to act on the workflow ID we want to suspend, kill, etc., and we get the error below:

Error: E0508 : E0508: User [?] not authorized for WF job [0001304-190209190348229-oozie-mapr-W]

To perform an operation such as kill or suspend, we need to generate a fresh authentication token for our user ID. First clear the cached token with the command below (the file lives in your home directory), then retry the suspend/kill action on the workflow ID:

rm ~/.oozie-auth-token

From Apache Oozie docs:

Once authentication is performed successfully the received authentication token is cached in the user home directory in the .oozie-auth-token file with owner-only permissions. Subsequent requests reuse the cached token while valid.

For more details, see the Authentication section of the official Apache Oozie documentation.
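The token reset described above can be wrapped in a small helper so that it is safe to run even when no token is cached. This is only a sketch; the function name `clear_oozie_token` is made up for illustration, and the file location follows the documentation excerpt quoted above.

```shell
#!/usr/bin/env bash
# Remove the cached Oozie token, if any, so the next CLI call authenticates again.
# -f keeps the command quiet and successful when the file does not exist.
clear_oozie_token() {
  rm -f -- "${HOME}/.oozie-auth-token"
}

# Hypothetical usage:
#   clear_oozie_token
#   oozie job -kill 0001304-190209190348229-oozie-mapr-W
```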

You may also find it helpful to know how to kill, rerun, etc. many jobs (for example, 200) at once using bash.

In one single line:

$ for jobid in $(oozie jobs -filter status=SUSPENDED | grep -oE '^[0-9]+-[0-9]+-[^ ]+'); do oozie job -kill "${jobid}" && echo "Killed job ${jobid}"; done
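The same idea reads more easily as a function with a dry-run switch, so you can check which jobs would be hit before killing anything. Treat this as a sketch under a few assumptions: the name `kill_jobs_by_status` and the DRY_RUN variable are invented here, OOZIE_URL is expected to be set, and -len 1000 is an arbitrary cap (`oozie jobs` lists 100 jobs by default, which would not cover 200).

```shell
#!/usr/bin/env bash
# Kill every workflow in the given status (default SUSPENDED).
# Set DRY_RUN=1 to print the IDs without killing anything.
kill_jobs_by_status() {
  local status="${1:-SUSPENDED}" job_id
  oozie jobs -filter "status=${status}" -len 1000 |
    awk '$1 ~ /^[0-9]+-[0-9]+-.+-W$/ { print $1 }' |   # keep workflow IDs only
    while read -r job_id; do
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "Would kill ${job_id}"
      elif oozie job -kill "${job_id}" </dev/null; then
        echo "Killed job ${job_id}"
      else
        echo "Failed to kill ${job_id}" >&2
      fi
    done
}

# Hypothetical usage:
#   DRY_RUN=1 kill_jobs_by_status SUSPENDED   # preview
#   kill_jobs_by_status SUSPENDED             # actually kill
```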