Import module in MRJob on EMR
Problem
Simple question: I have a module, headers.py, which defines a couple of variables I need in my main MRJob script. I should be able to run the job with
python MRMyJob -r emr --file=headers.py s3://input/data/path
and then in my MRJob script (MRMyJob), the following should work:
from headers import header1, header2, header3
Right? From the mrjob --help page: "--file=UPLOAD_FILES Copy file to the working directory of this script. You can use --file multiple times."
I'm still getting "no module named headers" when I try to import it.
Solution
headers.py is apparently not put in your remote PYTHONPATH. See the docs on how to get additional modules across to the cluster; you have to put them in a tarball first.
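The tarball step could look roughly like the sketch below. It only covers packing the module with Python's standard tarfile library; the helper names (build_module_tarball, list_tarball) and the archive name headers.tar.gz are made up for illustration. How the archive then gets onto the remote PYTHONPATH depends on your mrjob version (older releases documented a --python-archive option for this), so check the docs for the version you run.

```python
import os
import tarfile
import tempfile


def build_module_tarball(module_paths, tarball_path):
    """Pack the given .py files into a gzipped tarball (hypothetical helper)."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        for path in module_paths:
            # Store each file at the archive root so that, once the archive
            # is on PYTHONPATH, `from headers import ...` can resolve.
            tar.add(path, arcname=os.path.basename(path))
    return tarball_path


def list_tarball(tarball_path):
    """Return the sorted member names, to check what was packed."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Demo with a throwaway headers.py; in practice, pass your real module.
    workdir = tempfile.mkdtemp()
    module = os.path.join(workdir, "headers.py")
    with open(module, "w") as f:
        f.write("header1, header2, header3 = 'a', 'b', 'c'\n")
    archive = build_module_tarball([module], os.path.join(workdir, "headers.tar.gz"))
    print(list_tarball(archive))
```

Keeping the module at the archive root (rather than under its original directory path) is the design choice that matters here: the import looks for headers.py directly on the path, not inside nested folders.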