How to compile a Java file which calls MeCab - Japanese part-of-speech & morphological analyzer?
28-10-2019
Question
I'm trying to use MeCab (http://mecab.sourceforge.net/#download) to do word segmentation of Japanese sentences as well as to tag every word by part of speech. I installed MeCab by following these instructions: http://mecab.sourceforge.net/#install-unix. Since I don't want to write shell scripts to process 150,000 sentences (as my Mac OS X Terminal has problems showing Japanese characters), I'm using the existing binding for Java: http://sourceforge.net/projects/mecab/files/mecab-java/0.98pre3/. At this point I'm trying to compile and run the given test.java file:
import org.chasen.mecab.MeCab;
import org.chasen.mecab.Tagger;
import org.chasen.mecab.Node;

public class test {
  static {
    try {
      System.loadLibrary("MeCab");
    } catch (UnsatisfiedLinkError e) {
      System.err.println("Cannot load the example native code.\nMake sure your LD_LIBRARY_PATH contains \'.\'\n" + e);
      System.exit(1);
    }
  }

  public static void main(String[] argv) {
    System.out.println(MeCab.VERSION);
    Tagger tagger = new Tagger();
    String str = "太郎は二郎にこの本を渡した。";
    System.out.println(tagger.parse(str));
    Node node = tagger.parseToNode(str);
    for (;node != null; node = node.getNext()) {
      System.out.println(node.getSurface() + "\t" + node.getFeature());
    }
    System.out.println ("EOS\n");
  }
}
Here's the README:
1. Build UTF-8 dictionary
2. How to use?
See test.java as sample program.
% java -classpath MeCab.jar test -d ../dic
I compile: javac test.java. Then I run: java -classpath MeCab.jar test -d ../dic. The result is the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: //
Caused by: java.lang.ClassNotFoundException: ..
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
I don't really understand the hierarchy of this mecab-java-0.98pre3 directory, so I don't see how to actually compile and run this test.java. Any ideas, guys? Thanks!
Solution
Have you first run make? The first step is to actually build the mecab-java binding library.
$ tar -xvzf mecab-java-0.xx.tar.gz
$ cd mecab-java-0.xx
$ make
This will result in the following 2 files being output:
- MeCab.jar
- libMeCab.so
However, this assumes your platform is Linux, you have access to the make program, and that your Java includes are at /usr/local/jdk/include.
If this is not the case, then try reading the Makefile to see if you can adapt it to build in your own environment.
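If the build succeeds but the program still stops with an UnsatisfiedLinkError, it can help to check where the JVM is actually looking for the native library. The following is only a minimal sketch and not part of the mecab-java distribution: the class name LibraryProbe and its findLibrary method are hypothetical names made up for illustration. It relies on System.mapLibraryName and the java.library.path system property to report which file name the JVM expects for a call such as System.loadLibrary("MeCab"), and which search directories contain that file.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of mecab-java.
class LibraryProbe {

    // Returns every directory in searchPath that contains the
    // platform-specific file name for libName (for example,
    // "MeCab" maps to "libMeCab.so" on Linux).
    static List<String> findLibrary(String libName, String searchPath) {
        List<String> hits = new ArrayList<String>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        String fileName = System.mapLibraryName(libName);
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Library name defaults to "MeCab", the name used in test.java.
        String libName = args.length > 0 ? args[0] : "MeCab";
        String searchPath = System.getProperty("java.library.path");

        System.out.println("Expected file name: " + System.mapLibraryName(libName));
        System.out.println("java.library.path:  " + searchPath);

        List<String> hits = findLibrary(libName, searchPath);
        if (hits.isEmpty()) {
            System.out.println("Not found in any search directory.");
        } else {
            for (String dir : hits) {
                System.out.println("Found in: " + dir);
            }
        }
    }
}
```

Running it from the build directory with java -Djava.library.path=. LibraryProbe should show whether the library produced by make is visible to the JVM under the expected name.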
OTHER TIPS
For OS X I updated my makefile. I made several changes:
- I set INCLUDE to point to the OS X JAVA_HOME/include (using the /usr/libexec/java_home utility, which I believe is standard on OS X installs).
- I set the second include path to $(INCLUDE)/darwin instead of /linux.
- I changed the CXX command to build a dylib instead of a Linux .so library, using the -dynamiclib compiler flag.
- I also renamed the library, because the regular mecab lib and the JNI wrapper are for some reason built with the same name, and since a default OS X install uses a case-insensitive file system, that could be very problematic. Instead of building lib$(TARGET).so I'm building lib$(TARGET)Jni.dylib.
- I also changed LD_LIBRARY_PATH in the make test target to DYLD_FALLBACK_LIBRARY_PATH=. but I think it would probably work without being changed.
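One consequence of the rename is worth spelling out: System.loadLibrary derives the file name from the name it is given, so once the wrapper is built as lib$(TARGET)Jni.dylib, the static block in test.java presumably has to ask for "MeCabJni" rather than "MeCab". Below is a minimal sketch under that assumption. The class NativeLoader and its tryLoad method are hypothetical names for illustration and are not part of mecab-java.

```java
// Hypothetical helper; assumes the JNI wrapper was renamed to
// libMeCabJni.dylib as described in the list above.
class NativeLoader {

    // Tries to load the named native library and reports failure
    // instead of letting the UnsatisfiedLinkError propagate.
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Cannot load " + System.mapLibraryName(libName)
                    + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "MeCabJni" matches lib$(TARGET)Jni.dylib with TARGET=MeCab.
        if (!tryLoad("MeCabJni")) {
            System.err.println("java.library.path = "
                    + System.getProperty("java.library.path"));
            System.exit(1);
        }
        System.out.println("Native wrapper loaded.");
    }
}
```

In test.java itself the equivalent change would be a single line, replacing the argument of System.loadLibrary in the static block.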
This is what my full makefile looks like.
TARGET=MeCab
JAVAC=javac
JAVA=java
JAR=jar
CXX=c++
# Locate the JDK headers via the OS X java_home utility
INCLUDE=$(shell echo `/usr/libexec/java_home`/include)
PACKAGE=org/chasen/mecab
LIBS=`mecab-config --libs`
INC=`mecab-config --cflags` -I$(INCLUDE) -I$(INCLUDE)/darwin

all:
	$(CXX) -O3 -c -fpic $(TARGET)_wrap.cxx $(INC)
	$(CXX) -dynamiclib $(TARGET)_wrap.o -o lib$(TARGET)Jni.dylib $(LIBS)
	$(JAVAC) $(PACKAGE)/*.java
	$(JAVAC) test.java
	$(JAR) cfv $(TARGET).jar $(PACKAGE)/*.class

test:
	env DYLD_FALLBACK_LIBRARY_PATH=. $(JAVA) test

# *.dylib added so the renamed JNI library is removed as well
clean:
	rm -fr *.jar *.o *.so *.dylib *.class $(PACKAGE)/*.class

cleanall:
	rm -fr $(TARGET).java *.cxx
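Note that the test target runs java test from the build directory without a -classpath option, so both test.class and the org/chasen/mecab classes are found through the default classpath (the current directory). When running by hand with -classpath MeCab.jar, as in the README, the current directory is no longer searched, so it has to be listed as well (for example -classpath MeCab.jar:.). The following is a minimal sketch for checking what the JVM can see. ClasspathCheck and isVisible are hypothetical names for illustration, not part of mecab-java.

```java
// Hypothetical diagnostic; reports whether classes can be resolved
// with the classpath the JVM was started with.
class ClasspathCheck {

    // Returns true if className can be located by this class's loader.
    // The class is not initialized, so no static blocks (such as a
    // System.loadLibrary call) are triggered by the check.
    static boolean isVisible(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (NoClassDefFoundError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        // Class names taken from test.java in the question.
        String[] names = { "org.chasen.mecab.Tagger", "org.chasen.mecab.Node", "test" };
        for (String name : names) {
            System.out.println(name + ": " + (isVisible(name) ? "found" : "NOT found"));
        }
    }
}
```

Running it with the same -classpath value intended for test shows which of the classes would be missing before the real program is started.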