Split Class not found during Hadoop Execution

https://stackoverflow.com/questions/19820204

04-07-2022

Question

I am having a strange issue. I have my own FileSystem implementation that I use instead of the default DistributedFileSystem. I have registered it through fs.default.name and the impl property. When a Hadoop job starts (the teragen program), I can see that reads and writes of job.xml, job.jar, etc. from the JobTracker work fine. But once map tasks are allocated by the JobTracker, they fail with an IOException saying that the split class is not found. I verified that all the classes are present in the examples jar.

Also, to narrow down the issue, I retried exactly the same setup with the default DistributedFileSystem, changing only fs.default.name. It works perfectly.

Command

bin/hadoop jar hadoop-examples-1.2.1.jar teragen 100000 /user/hduser/terasort-input

Initially I thought this was a classpath issue, but if the same setup works with the default DistributedFileSystem, where can the problem be? How can my filesystem implementation affect the JobTracker and the classpath?

I really appreciate your help.

2013-11-06 12:30:35,018 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201311061227_0001_m_000000_0: java.io.IOException: Split class PLLorg.apache.hadoop.examples.terasort.TeraGen$RangeInputFormat$RangeInputSplit not found
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:381)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:406)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.lang.ClassNotFoundException: PLLorg.apache.hadoop.examples.terasort.TeraGen$RangeInputFormat$RangeInputSplit
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:266)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:379)

Solution

Check your filesystem implementation for Path conversion issues in places such as the following:

workingDirectory
FileStatus: the FileStatus object you return includes a Path, and callers may use that Path later
makeQualified
pathToFile
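
As a rough illustration of what consistent path conversion means, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class PathQualifier, the method toStorePath, the myfs scheme and the node1:9000 authority are all hypothetical names chosen for this example. It assumes paths contain no characters that need URI escaping. The idea is that a relative path, an absolute path, and an already qualified path should all end up as the same fully qualified form, and that stripping the scheme and authority again should give the same key on every node.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: shows one consistent way to
// qualify paths against a filesystem URI and a working directory.
final class PathQualifier {
    private final URI fsUri;         // e.g. myfs://node1:9000 (made-up scheme)
    private final String workingDir; // absolute directory, e.g. /user/hduser

    PathQualifier(URI fsUri, String workingDir) {
        this.fsUri = fsUri;
        this.workingDir = workingDir;
    }

    // Turn a relative, absolute, or already qualified path into a full URI.
    URI makeQualified(String path) {
        URI parsed = URI.create(path);
        if (parsed.getScheme() != null) {
            return parsed.normalize(); // already carries scheme and authority
        }
        String absolute;
        if (path.startsWith("/")) {
            absolute = path;
        } else if (workingDir.endsWith("/")) {
            absolute = workingDir + path;
        } else {
            absolute = workingDir + "/" + path;
        }
        // resolve() keeps the scheme and authority of fsUri and swaps in the path
        return fsUri.resolve(absolute).normalize();
    }

    // Strip scheme and authority to get the key used by the backing store.
    String toStorePath(String path) {
        return makeQualified(path).getPath();
    }

    public static void main(String[] args) {
        PathQualifier q =
            new PathQualifier(URI.create("myfs://node1:9000"), "/user/hduser");
        System.out.println(q.makeQualified("terasort-input"));
        System.out.println(q.makeQualified("/user/hduser/terasort-input"));
        System.out.println(q.toStorePath("myfs://node1:9000/user/hduser/terasort-input"));
    }
}
```

In a real FileSystem subclass the same consistency has to hold across the working directory, the Path stored in every returned FileStatus, and whatever conversion maps a Path to the underlying storage, since the task side may reopen files using the paths the client side produced.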

You may also need to check whether your filesystem is actually a shared filesystem, that is, whether all nodes (the TaskTrackers and the JobTracker) see the same namespace.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow