Question

I want to run the MongoDB Hadoop Streaming connector, so I downloaded a compatible version of Hadoop (2.2.0; see https://github.com/mongodb/mongo-hadoop/blob/master/README.md#apache-hadoop-22).

I cloned the mongo-hadoop git repository and set hadoopRelease to 2.2 in build.sbt:

$ cat build.sbt
name := "mongo-hadoop"

organization := "org.mongodb"

hadoopRelease in ThisBuild := "2.2"

Then I ran:

$ ./sbt package
$ ./sbt mongo-hadoop-streaming/assembly
$ cp core/target/mongo-hadoop-core_2.2.0-1.2.0.jar ../hadoop-2.2.0/lib/
$ cp mongo-2.7.3.jar ../hadoop-2.2.0/lib/ # Previously downloaded
$ cd ../hadoop-2.2.0/
$ ./bin/hadoop jar ../mongo-hadoop/streaming/target/mongo-hadoop-streaming-assembly-1.1.0.jar -mapper ...

And I get this:

Exception in thread "main" java.lang.ClassNotFoundException: com.mongodb.hadoop.streaming.MongoStreamJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:249)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

I don't understand why. I have tried almost every version that is supposed to support streaming, but I always get the same error!

For the record, I am on Mac OS X. Thanks!


Solution

That is actually a bug that will be fixed in an upcoming release. The need for that main class was removed, but the Main-Class entry in the generated manifest was not, so Hadoop's RunJar still tries to load a class that no longer exists. You can work around it by removing the Main-Class entry from the manifest in the streaming jar. If you run the script below in the directory where your streaming jar is, passing the jar's file name as its argument, it'll fix that for you:

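#!/bin/sh
# Run from the directory that contains the streaming jar,
# passing the jar's file name as the only argument.
set -e

JAR=$1
M=META-INF/MANIFEST.MF

mkdir tmp
cd tmp

# Unpack the jar into the scratch directory.
jar xf "../${JAR}"

# Write a copy of the manifest without the stale Main-Class entry.
sed -e '/Main-Class/d' "${M}" > ../manifest.new

# Repack every extracted file (note the trailing "."), using the edited manifest.
jar cfm "../${JAR}.new" ../manifest.new .

# Replace the original jar and clean up.
cd ..
mv "${JAR}.new" "${JAR}"
rm -r tmp manifest.new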

It's not super pretty, but it should get you over the hump. We'll try to get a formal 1.2.1 release out soonish. Here's the JIRA ticket in the meantime: https://jira.mongodb.org/browse/HADOOP-121
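
To check whether a given jar still needs the workaround, you can look at its manifest directly. The sketch below is only an illustration: has_main_class and strip_main_class are hypothetical helper names, not part of mongo-hadoop. Both read manifest text on standard input, and the second applies the same sed expression as the script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of mongo-hadoop): reads a jar manifest on
# stdin and succeeds only if it still declares a Main-Class entry.
has_main_class() {
    grep -q '^Main-Class:'
}

# Hypothetical helper: prints the manifest read from stdin with any line
# containing Main-Class removed (same sed expression as the script above).
strip_main_class() {
    sed -e '/Main-Class/d'
}
```

For example, assuming unzip is installed, `unzip -p mongo-hadoop-streaming-assembly-1.1.0.jar META-INF/MANIFEST.MF | has_main_class && echo "Main-Class still present"` reports whether the entry is still there, without unpacking the whole jar.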

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow