Question

Simple question: I have a module, headers.py, which defines a couple of variables I need in my main MRJob script. I should be able to run the job with

python MRMyJob -r emr --file=headers.py s3://input/data/path

and then in my MRJob script (MRMyJob), the following should work:

from headers import header1, header2, header3

Right? From the mrjob --help page: "--file=UPLOAD_FILES Copy file to the working directory of this script. You can use --file multiple times."

I'm still getting "no module named headers" when I try to import it.


Solution

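Apparently headers.py does not end up on your remote PYTHONPATH: --file only copies it to the script's working directory, which does not by itself make it importable. See the mrjob docs on how to get additional modules across to the cluster; you have to put them in a tarball first.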
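
For the tarball step, here is a minimal sketch using Python's standard tarfile module. The function name make_module_tarball and the archive name headers.tar.gz are made up for illustration and are not part of mrjob. How the archive is then handed to the job depends on your mrjob version: older releases documented a --python-archive option for this, but treat that as an assumption and check the docs for the version you run.

```python
import tarfile
from pathlib import Path


def make_module_tarball(module_paths, archive_path="headers.tar.gz"):
    """Pack the given .py files at the root of a gzip-compressed tarball.

    Hypothetical helper for illustration; the names are not from mrjob.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for module_path in module_paths:
            module_path = Path(module_path)
            # arcname drops the directory prefix so the module sits at the
            # archive root and is importable once that root is on PYTHONPATH.
            tar.add(str(module_path), arcname=module_path.name)
    return archive_path
```

Calling make_module_tarball(["headers.py"]) from the directory that holds headers.py writes headers.tar.gz next to it, with headers.py at the top level of the archive.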

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow