Question

I'm trying to use MeCab (http://mecab.sourceforge.net/#download) to segment Japanese sentences into words and to tag every word with its part of speech. I installed MeCab by following these instructions: http://mecab.sourceforge.net/#install-unix. Since I don't want to write shell scripts to process 150,000 sentences (my Mac OS X Terminal has problems displaying Japanese characters), I'm using the existing Java binding: http://sourceforge.net/projects/mecab/files/mecab-java/0.98pre3/. At this point I'm trying to compile and run the provided test.java file:

import org.chasen.mecab.MeCab;
import org.chasen.mecab.Tagger;
import org.chasen.mecab.Node;

public class test {
  static {
    try {
       System.loadLibrary("MeCab");
    } catch (UnsatisfiedLinkError e) {
       System.err.println("Cannot load the example native code.\nMake sure your LD_LIBRARY_PATH contains \'.\'\n" + e);
       System.exit(1);
    }
  }

  public static void main(String[] argv) {
     System.out.println(MeCab.VERSION);
     Tagger tagger = new Tagger();
     String str = "太郎は二郎にこの本を渡した。";
     System.out.println(tagger.parse(str));
     Node node = tagger.parseToNode(str);
     for (; node != null; node = node.getNext()) {
        System.out.println(node.getSurface() + "\t" + node.getFeature());
     }
     System.out.println ("EOS\n");
  }
}

Here's the README:

1. Build UTF-8 dictionary

2. How to use?

  See test.java as sample program.

  % java -classpath MeCab.jar test -d ../dic

I compile: javac test.java. Then I run: java -classpath MeCab.jar test -d ../dic. The result is the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: //
Caused by: java.lang.ClassNotFoundException: ..
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)

I don't really understand the layout of this mecab-java-0.98pre3 directory, so I don't see how to actually compile and run this test.java. Any ideas, guys? Thanks!

Solution

Have you first run make? The first step is to actually build the mecab-java binding library.

$ tar -xvzf mecab-java-0.xx.tar.gz
$ cd mecab-java-0.xx
$ make

This produces the following two files:

  1. MeCab.jar
  2. libMeCab.so

However, this assumes that your platform is Linux, that the make program is available, and that your Java include files are at /usr/local/jdk/include.

If this is not the case, try reading the Makefile to see whether you can adapt the build to your own environment.
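
Once the build succeeds, the JVM still has to find the native library at run time. As a minimal sketch, assuming the library name "MeCab" that test.java passes to System.loadLibrary, the hypothetical helper class LibraryPathCheck below (it is not part of mecab-java) scans the directories in java.library.path and reports where the platform-specific file name turns up. It may help narrow things down when the UnsatisfiedLinkError branch in test.java fires.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of the mecab-java distribution.
class LibraryPathCheck {

    // Returns every file on java.library.path whose name matches the
    // platform-specific file name for libName (for example libMeCab.so on Linux).
    static List<File> findLibrary(String libName) {
        String fileName = System.mapLibraryName(libName);
        String searchPath = System.getProperty("java.library.path", "");
        List<File> hits = new ArrayList<File>();
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty path entries
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: "MeCab" is the name used in test.java; pass another name as the first argument.
        String libName = args.length > 0 ? args[0] : "MeCab";
        System.out.println("Looking for: " + System.mapLibraryName(libName));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        List<File> hits = findLibrary(libName);
        if (hits.isEmpty()) {
            System.out.println("Not found on java.library.path");
        } else {
            for (File f : hits) {
                System.out.println("Found: " + f.getAbsolutePath());
            }
        }
    }
}
```

Run it from the directory where make put the library, for example with java -Djava.library.path=. LibraryPathCheck MeCab. Note also that when -classpath is given explicitly, the current directory is no longer searched implicitly, so test.class would need something like -classpath MeCab.jar:. to be visible.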

OTHER TIPS

For OS X I updated my makefile. I made several changes:

  1. I set INCLUDE to point to the OS X JAVA_HOME/include (using the /usr/libexec/java_home utility, which I believe is standard on OS X installs).
  2. I set the second include path to $(INCLUDE)/darwin instead of /linux.
  3. I changed the CXX command to build a dylib instead of a Linux .so library, using the -dynamiclib compiler flag.
  4. I also renamed the library, because the regular mecab lib and the JNI wrapper are for some reason built with the same name, and since a default OS X install uses a case-insensitive file system, that could be very problematic. Instead of building lib$(TARGET).so I'm building lib$(TARGET)Jni.dylib.
  5. I also changed LD_LIBRARY_PATH in the make test target to DYLD_FALLBACK_LIBRARY_PATH=. but I think it would probably work without that change.

This is what my full makefile looks like.

TARGET=MeCab
JAVAC=javac
JAVA=java
JAR=jar
CXX=c++
INCLUDE=$(shell echo `/usr/libexec/java_home`/include)

PACKAGE=org/chasen/mecab

LIBS=`mecab-config --libs`
INC=`mecab-config --cflags` -I$(INCLUDE) -I$(INCLUDE)/darwin

all:
    $(CXX) -O3 -c -fpic $(TARGET)_wrap.cxx  $(INC)
    $(CXX) -dynamiclib  $(TARGET)_wrap.o -o lib$(TARGET)Jni.dylib $(LIBS)
    $(JAVAC) $(PACKAGE)/*.java
    $(JAVAC) test.java
    $(JAR) cfv $(TARGET).jar $(PACKAGE)/*.class

test:
    env DYLD_FALLBACK_LIBRARY_PATH=. $(JAVA) test

clean:
    rm -fr *.jar *.o *.so *.dylib *.class $(PACKAGE)/*.class

cleanall:
    rm -fr $(TARGET).java *.cxx
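
Because this makefile names the wrapper lib$(TARGET)Jni.dylib, the System.loadLibrary("MeCab") call in test.java would presumably need to name the renamed library instead. A minimal sketch, assuming TARGET=MeCab so that the library name becomes "MeCabJni"; the class NativeLoadCheck is a hypothetical helper, not part of mecab-java, and only checks whether the JVM can load the library by that name:

```java
// Hypothetical helper, not part of the mecab-java distribution.
class NativeLoadCheck {

    // Attempts to load the named JNI library; returns true on success,
    // false (after printing the reason) if the JVM cannot find or link it.
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Cannot load " + System.mapLibraryName(libName)
                    + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: "MeCabJni" follows from lib$(TARGET)Jni.dylib with TARGET=MeCab.
        String libName = args.length > 0 ? args[0] : "MeCabJni";
        if (tryLoad(libName)) {
            System.out.println("Loaded " + libName);
        } else {
            System.out.println("Failed to load " + libName);
        }
    }
}
```

If java -Djava.library.path=. NativeLoadCheck reports success from the build directory, the same name should work in the static block of test.java.
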
Licensed under: CC-BY-SA with attribution