Following on from an earlier question...
I have an Oozie workflow that contains a shell action that invokes a Python script that is failing with the following error.
IOError: [Errno 13] Permission denied: '/home/test/myfile.txt'
All the Python script (hello.py) tries to do is open a file. This code works fine when executed outside of Hadoop.
if __name__ == '__main__':
    print('Starting script')
    filein = '/home/test/myfile.txt'
    file = open(filein, 'r')
Here is my Oozie workflow.
<workflow-app xmlns="uri:oozie:workflow:0.4" name="hello">
    <start to="shell-check-hour" />
    <action name="shell-check-hour">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>hello.py</exec>
            <file>hdfs://localhost:8020/user/test/hello.py</file>
            <capture-output />
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
If I try to give an absolute path to where the file is, I get permission denied.
filein = '/home/test/myfile.txt'
If I try just the file name, I get file not found. I don't understand this, because the Python script and the file are in the same HDFS location.
filein = 'myfile.txt'
Maybe I need to modify my Oozie workflow to pass the file in as a parameter as well?
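To narrow this down, a small diagnostic along the following lines could be run from the same shell action in place of hello.py. It is only a sketch: `describe_environment` and `current_user` are hypothetical helper names, not part of Oozie, and it assumes the action may run the script on a cluster node as a different user and in a temporary working directory. It prints the user the script runs as, the working directory and its contents, and whether each candidate path exists and is readable.

```python
import getpass
import os


def current_user():
    """Return the login name, falling back to the numeric uid (hypothetical helper)."""
    try:
        return getpass.getuser()
    except Exception:
        # No passwd entry or environment variable for this account.
        return 'uid=%d' % os.getuid()


def describe_environment(path):
    """Collect facts about where, and as whom, the script is running (hypothetical helper)."""
    return {
        'user': current_user(),
        'cwd': os.getcwd(),
        'cwd_entries': sorted(os.listdir('.')),
        'exists': os.path.exists(path),
        'readable': os.access(path, os.R_OK),
    }


if __name__ == '__main__':
    # The two paths tried in the question: absolute and relative.
    for name in ('/home/test/myfile.txt', 'myfile.txt'):
        print('%s -> %s' % (name, describe_environment(name)))
```

If the output shows a user other than test, or a working directory that contains hello.py but not myfile.txt, that would point to the cause of both errors.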