Hadoop WordCount example: "Run on Hadoop" option in Eclipse does not prompt for the "Select Hadoop server to run on" window

StackOverflow https://stackoverflow.com/questions/16133082

Question

I am trying to run the WordCount example in Eclipse. Normally, clicking the "Run on Hadoop" option in Eclipse opens a new window asking you to select a server location. Now, however, it runs the program directly without asking me to choose an existing server from the list.

I think this is why I am getting the following exception:

13/04/21 08:46:31 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser1 cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hduser1/gutenbergIP/pg4300.txt
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/hduser1/gutenbergIP/pg4300.txt

My code works if I change the line from:

FileInputFormat.setInputPaths(conf, "/home/hduser1/gutenbergIP/pg4300.txt");

to:

FileInputFormat.setInputPaths(conf, "hdfs://localhost:54310/home/hduser1/gutenbergIP/pg4300.txt");

If I explicitly specify the file name with the full URL, it works. How can I make my relative URL work instead of giving the full URL? (I have to submit this as a school assignment.)


Solution

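Add the following two lines to your code, where config is your job's configuration object (conf in the question) and /HADOOP_HOME stands for your Hadoop installation directory: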

config.addResource(new Path("/HADOOP_HOME/conf/core-site.xml"));
config.addResource(new Path("/HADOOP_HOME/conf/hdfs-site.xml"));
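For context, the setting in core-site.xml that matters here is the default filesystem URI. A minimal sketch of such an entry is shown below. It assumes the older property name fs.default.name (newer Hadoop releases call it fs.defaultFS) and reuses the hdfs://localhost:54310 address from the question; your actual file may contain different values.

```xml
<configuration>
  <property>
    <!-- Default filesystem used for paths that carry no scheme (assumed value) -->
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
```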

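If you don't add these resources, the client never learns the default filesystem of your cluster and falls back to the local filesystem (file:///). A path without a scheme, such as /home/hduser1/gutenbergIP/pg4300.txt, is then looked up on the local disk, where it does not exist, hence the "Input path does not exist: file:/home/..." error.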
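The effect of the default filesystem on a scheme-less path can be illustrated with plain java.net.URI. This is only a simplified model of the behaviour, not Hadoop's own code; the class DefaultFsDemo and the method qualify are hypothetical names made up for this sketch.

```java
import java.net.URI;

public class DefaultFsDemo {
    // Hypothetical helper: qualify a path against a default filesystem URI.
    // A path that already carries a scheme (e.g. hdfs://...) is returned unchanged.
    public static URI qualify(String defaultFs, String path) {
        return URI.create(defaultFs).resolve(path);
    }

    public static void main(String[] args) {
        String input = "/home/hduser1/gutenbergIP/pg4300.txt";

        // No cluster configuration loaded: the default is the local filesystem.
        System.out.println(qualify("file:///", input));

        // Cluster configuration loaded: the same path now points into HDFS.
        System.out.println(qualify("hdfs://localhost:54310", input));
    }
}
```

With the local default, the path resolves to a file: URI, as in the exception above. With the HDFS default, the same short path resolves to the full URL that the question had to spell out by hand.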

Licensed under: CC-BY-SA with attribution