Question

I am using HCatalog's WebHCat API to run Pig jobs, as documented here:

https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Pig

I have no problem running a simple job, but I would like to attach a parameters file to the job, as one can with the Pig command line's -param_file option.

I assume this is possible through the request's arg parameter, so I tried several things, such as passing:

'arg': '-param_file /path/to/param.file'

or:

'arg': {'param_file': '/path/to/param.file'}

Neither works, and the error stacks don't say much. I would love to know whether this is possible and, if so, how to do it correctly.

Many thanks


Solution

Correct usage:

'arg': ['-param_file', '/path/to/param.file']

Explanation: if the value of arg is passed as a dict,

'arg': {'-param_file': '/path/to/param.file'}

WebHCat generates only "-param_file" on the command line, and the path is dropped. Pig then throws the following error:

ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Can not create a Path from a null string

Using a list (a comma between the two strings instead of a colon) passes the file path as a second argument, so WebHCat generates "-param_file" "/path/to/param.file".
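To see why the three forms behave differently, it helps to look at the form body each one produces. The sketch below uses the standard library's urlencode with doseq=True as a stand-in; as far as I can tell, Requests encodes the data argument in an equivalent way, but that is an assumption and worth checking against your installed version. The helper name encode_form is made up for illustration.

```python
from urllib.parse import urlencode


def encode_form(data):
    # doseq=True expands any sequence value into repeated key=value pairs.
    return urlencode(data, doseq=True)


# A single string: one argument that contains a space.
as_string = encode_form({"arg": "-param_file /path/to/param.file"})

# A dict: iterating a dict yields only its keys, so the path is lost.
as_dict = encode_form({"arg": {"-param_file": "/path/to/param.file"}})

# A list: two separate "arg" fields, one per command-line argument.
as_list = encode_form({"arg": ["-param_file", "/path/to/param.file"]})

print(as_string)  # arg=-param_file+%2Fpath%2Fto%2Fparam.file
print(as_dict)    # arg=-param_file
print(as_list)    # arg=-param_file&arg=%2Fpath%2Fto%2Fparam.file
```

Only the list form sends the option and its value as two distinct arg fields, which matches the two arguments Pig expects.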

P.S.: I am using the Requests library in Python to make the REST calls.
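For completeness, here is a minimal sketch of what the whole call might look like with Requests. The host, port, user name, and HDFS paths are placeholders, and the helper names build_pig_payload and submit_pig_job are hypothetical. Passing user.name as a query parameter is also an assumption; depending on the Hive version it may belong in the form data instead.

```python
def build_pig_payload(script_path, param_file, status_dir=None):
    # "file" is the HDFS path of the Pig script; "arg" is repeated once per
    # command-line argument, so the option and its value are list items.
    payload = {
        "file": script_path,
        "arg": ["-param_file", param_file],
    }
    if status_dir is not None:
        payload["statusdir"] = status_dir
    return payload


def submit_pig_job(base_url, user, payload):
    # Imported here so the payload helper works without Requests installed.
    import requests

    response = requests.post(
        base_url + "/templeton/v1/pig",
        params={"user.name": user},  # assumption: user passed in the query string
        data=payload,
    )
    response.raise_for_status()
    return response.json()


payload = build_pig_payload("/user/someuser/script.pig", "/path/to/param.file")

# Example call (placeholder host, needs a reachable WebHCat server):
# result = submit_pig_job("http://webhcat-host:50111", "someuser", payload)
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect exactly what will be sent before submitting a job.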

Licensed under: CC-BY-SA with attribution