I've dealt with a similar issue on EMR in the past. The property you are looking for is mapred.task.timeout,
which is the number of milliseconds before a task is terminated if it neither reads input, writes output, nor updates its status string.
With MRJob, you could add the following option (1800000 ms is 30 minutes):
--jobconf mapred.task.timeout=1800000
EDIT: It appears that some EMR AMIs do not support setting parameters like the timeout with jobconf at run time. Instead, you must use bootstrap-time configuration like this:
--bootstrap-action="s3://elasticmapreduce/bootstrap-actions/configure-hadoop -m mapred.task.timeout=1800000"
I would still try the first option to start with and see if you can get it to work; otherwise, try the bootstrap action.
To use any of these parameters, just create your job by extending MRJob. This class has a jobconf method that will read your --jobconf parameters, so you should specify these as regular options on the command line:
python job.py --num-ec2-instances 42 --python-archive t.tar.gz -r emr --jobconf mapred.task.timeout=1800000 /path/to/input.txt
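If you end up switching between the two variants, a small helper can keep the millisecond arithmetic in one place. This is only a sketch using the standard library: the function names timeout_ms and timeout_args are made up for illustration and are not part of mrjob, and the bootstrap script path is simply the one quoted above.

```python
import shlex

# Bootstrap script path as quoted in the answer above (not verified here).
CONFIGURE_HADOOP = "s3://elasticmapreduce/bootstrap-actions/configure-hadoop"


def timeout_ms(minutes):
    """Convert minutes to the millisecond value mapred.task.timeout expects."""
    return int(minutes * 60 * 1000)


def timeout_args(minutes, use_bootstrap=False):
    """Return the extra command-line arguments for the chosen variant.

    Hypothetical helper: it only builds strings, it does not call mrjob.
    """
    setting = "mapred.task.timeout=%d" % timeout_ms(minutes)
    if use_bootstrap:
        # Bootstrap-time variant, for AMIs that ignore --jobconf at run time.
        return ["--bootstrap-action=%s -m %s" % (CONFIGURE_HADOOP, setting)]
    # Run-time variant, passed straight through to the job.
    return ["--jobconf", setting]


if __name__ == "__main__":
    base = ["python", "job.py", "-r", "emr"]
    cmd = base + timeout_args(30) + ["/path/to/input.txt"]
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing use_bootstrap=True returns the single --bootstrap-action argument instead, so the rest of the command line stays the same whichever variant your AMI accepts.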