Question

I'm new to HDInsight. I want to learn and practice machine learning, and HDInsight seems to be just what I need, but there appears to be no direct API for Mahout. Since a Mahout recommendation essentially translates into a MapReduce job, I followed some MapReduce examples in the Windows Azure documentation and wrote the following code:

// Define the MapReduce job
MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters()
{
    JarFile = "wasb:///example/jars/mahout-core-0.9-job.jar",
    ClassName = "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob",
};

mrJobDefinition.Arguments.Add(" -s SIMILARITY_COOCCURRENCE");
mrJobDefinition.Arguments.Add(" --input=/reply");
mrJobDefinition.Arguments.Add(" --output=/recommend/");
mrJobDefinition.Arguments.Add(" --usersFile=/data/users.txt");

I have already uploaded "mahout-core-0.9-job.jar" to /example/jars in the specified Azure blob storage container.

But I received the following error message:

14/04/03 12:04:28 ERROR security.UserGroupInformation: PriviledgedActionException as:johnny cause:java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
java.security.PrivilegedActionException: java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:951)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:136)
    at org.apache.hadoop.mapred.JobClient.readTokensFromFiles(JobClient.java:2149)
    at org.apache.hadoop.mapred.JobClient.populateTokenCache(JobClient.java:2185)
    at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:964)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
    ... 16 more
Caused by: java.io.FileNotFoundException: File file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken= does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:427)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:254)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:130)
    ... 21 more
Exception in thread "main" java.io.IOException: Exception reading file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken=
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:136)
    at org.apache.hadoop.mapred.JobClient.readTokensFromFiles(JobClient.java:2149)
    at org.apache.hadoop.mapred.JobClient.populateTokenCache(JobClient.java:2185)
    at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:179)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:964)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:951)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.FileNotFoundException: File file:/c:/apps/temp/hdfs/mapred/local/taskTracker/johnny/jobcache/job_201404031203_0001/jobToken= does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:427)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:254)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:130)
    ... 21 more
Shutting down watcher/keep alive thread pool forcefully
templeton: job failed with exit code 1

After searching online, it seems that some change needs to be made to mapred-site.xml or other Hadoop config files. But I'm totally new to Apache Hadoop and don't know much about Linux or Java.

Any help or direction would be much appreciated.

Solution

With the latest .NET SDK for Hadoop (http://hadoopsdk.codeplex.com/), I can submit the Mahout job successfully using the same code. The issue appears to have been fixed in the SDK.

Licensed under: CC-BY-SA with attribution