Question

I am running Hadoop in pseudo-distributed mode on Windows from Eclipse. I need to pass some JVM options to the reducers. I have tried:

<property>
    <name>mapred.reduce.child.java.opts</name>
    <value>-Dtca.TCA_PROPERTIES=C:\Users\uagrawal\workspace\TCAenv -DMDAPI=C:\Users\uagrawal\workspace\mdapi</value>
</property>

but without success. Earlier, in local standalone mode, I only had to specify these JVM options in the Run dialog box and they worked fine. In pseudo-distributed mode, however, specifying these parameters in the Run dialog box has no effect.

This is the error I get in pseudo distributed mode:

MDV_DATE not found....
java.lang.ClassCastException: com.itginc.tca.config.Config cannot be cast to com.itginc.tca.config.TcaConfig

These errors occur because the program did not receive the mdapi and TCAenv locations.


Solution

Instead of setting "mapred.reduce.child.java.opts" in mapred.xml, I set "mapred.child.java.opts" to "-Dtca.TCA_PROPERTIES=C:\Users\uagrawal\workspace\TCAenv -DMDAPI=C:\Users\uagrawal\workspace\mdapi".

I think this works because in pseudo-distributed mode each task started by the TaskTracker runs in its own child JVM, so the options have to be set on the child JVM. Options given in the Eclipse Run dialog only reach the JVM that submits the job.
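To check whether the options actually reached the child JVM, something like the following could be called from the reducer. This is only a sketch using the plain JDK; the class name ChildJvmProps and the method requireProperty are made up for illustration, and it assumes the reducer code reads these values through System.getProperty.

```java
// Hypothetical helper (not part of Hadoop): reads -D options of the current JVM.
final class ChildJvmProps {

    // Returns the system property, or fails with a message naming the missing key.
    static String requireProperty(String name) {
        String value = System.getProperty(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                "System property '" + name + "' is not set in this JVM; "
                + "check mapred.child.java.opts");
        }
        return value;
    }

    // Prints the two properties from the question, or a placeholder if absent.
    public static void main(String[] args) {
        for (String name : new String[] {"tca.TCA_PROPERTIES", "MDAPI"}) {
            System.out.println(name + " = " + System.getProperty(name, "<not set>"));
        }
    }
}
```

Calling ChildJvmProps.requireProperty("MDAPI") early in the reducer would turn a confusing downstream error into a direct message about the missing option.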

Other tips

To pass a command-line argument, use -D name=value. For example, to configure the temp directory for all tasks, you could use the following command:

hadoop jar JAR.jar mainClass -D mapreduce.task.tmp.dir="/path/to/temp/dir"

Don't forget to put a space between -D and name=value. You can then read these options using GenericOptionsParser.
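To illustrate what the spaced form means for the argument list, here is a simplified stand-in written with the plain JDK. It is not Hadoop's GenericOptionsParser and does not reproduce its behaviour; the class name DashDArgs is made up, and the sketch only handles the "-D name=value" form described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: splits "-D name=value" pairs from the other arguments.
final class DashDArgs {
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<String> remaining = new ArrayList<>();

    static DashDArgs parse(String[] args) {
        DashDArgs out = new DashDArgs();
        for (int i = 0; i < args.length; i++) {
            // A pair is "-D" followed by a token with a non-empty name before '='.
            boolean isPair = "-D".equals(args[i])
                && i + 1 < args.length
                && args[i + 1].indexOf('=') > 0;
            if (isPair) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                out.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                // Everything else is left for the application itself.
                out.remaining.add(args[i]);
            }
        }
        return out;
    }
}
```

With the arguments from the command above, parse would store mapreduce.task.tmp.dir in properties and leave any other arguments in remaining.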

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow