Question

I'm trying to run the CRFClassifier on a string to extract entities from the string. I'm using the Ruby bindings for the Stanford NLP entity recognizer from here: https://github.com/tiendung/ruby-nlp

It works fine in a standalone class (say, nlp.rb): running ruby nlp.rb succeeds. However, when I create an object of this class inside one of the controllers in my Rails app, I get the following error:

java.lang.NoClassDefFoundError: edu/stanford/nlp/ie/crf/CRFClassifier

Here is the code that works fine on its own but not inside a controller.

    def initialize
        Rjb::load('stanford-postagger.jar:stanford-ner.jar', ['-Xmx200m'])
        crfclassifier = Rjb::import('edu.stanford.nlp.ie.crf.CRFClassifier')
        maxentTagger = Rjb::import('edu.stanford.nlp.tagger.maxent.MaxentTagger')
        maxentTagger.init("left3words-wsj-0-18.tagger")
        sentence = Rjb::import('edu.stanford.nlp.ling.Sentence')
        @classifier = crfclassifier.getClassifierNoExceptions("ner-eng-ie.crf-4-conll.ser.gz")


    end


    def get_entities(sentence)
        sent = sentence
        @classifier.testStringInlineXML( sent )

    end

The code is exactly the same in both cases. Does anyone have any idea what's happening here?

Thanks in advance!


Solution

I think you need to give Rjb::load the full paths to the jars:

Rjb::load('/path/to/jar/stanford-postagger.jar:/path/to/jar/stanford-ner.jar', ['-Xmx200m'])

I just tried this and it works. Create a directory called nlp in lib and put the jars there, then create a class (lib/nlp.rb) that loads the jars using their full paths. You end up with this layout, followed by the class:

├── lib
│   ├── nlp
│   │   ├── stanford-ner.jar
│   │   └── stanford-postagger.jar
│   └── nlp.rb



require 'rjb'

class NLP
  def initialize
    pos_tagger = File.expand_path('../nlp/stanford-postagger.jar', __FILE__)
    ner = File.expand_path('../nlp/stanford-ner.jar', __FILE__)
    Rjb::load("#{pos_tagger}:#{ner}", ['-Xmx200m'])
    crfclassifier = Rjb::import('edu.stanford.nlp.ie.crf.CRFClassifier')
    maxentTagger = Rjb::import('edu.stanford.nlp.tagger.maxent.MaxentTagger')
    maxentTagger.init("left3words-wsj-0-18.tagger")
    sentence = Rjb::import('edu.stanford.nlp.ling.Sentence')
    @classifier = crfclassifier.getClassifierNoExceptions("ner-eng-ie.crf-4-conll.ser.gz")
  end


  def get_entities(sentence)
    sent = sentence
    @classifier.testStringInlineXML( sent )
  end
end
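
The relative jar names most likely fail under Rails because a relative classpath entry is resolved against the process's working directory, which for a Rails app is usually the application root rather than the directory holding nlp.rb. A minimal sketch of building the classpath from absolute paths follows. Note that build_classpath is a hypothetical helper name, not part of Rjb, and that File::PATH_SEPARATOR replaces the hard-coded ':' so the string also works on platforms where the JVM expects ';'.

```ruby
# Hypothetical helper (not part of Rjb): turn jar file names into an
# absolute classpath string anchored at a known directory.
def build_classpath(jar_dir, jar_names)
  jar_names
    .map { |name| File.expand_path(name, jar_dir) } # absolute path per jar
    .join(File::PATH_SEPARATOR)                     # ':' on Unix, ';' on Windows
end

# Assumed layout from the answer: the jars live in lib/nlp next to lib/nlp.rb.
jar_dir = File.expand_path('nlp', File.dirname(__FILE__))
classpath = build_classpath(jar_dir, %w[stanford-postagger.jar stanford-ner.jar])
puts classpath
```

The resulting string could be passed as the first argument to Rjb::load in place of the interpolated "#{pos_tagger}:#{ner}".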

A small test script (t.rb):

require_relative 'lib/nlp'

n = NLP.new
n.get_entities("Good afternoon Rajat Raina, how are you today?")

Output:

ruby t.rb
Loading classifier from /Users/brendan/code/ruby/ruby-nlp/ner-eng-ie.crf-4-conll.ser.gz ... done [1.2 sec].
Getting data from Good afternoon Rajat Raina, how are you today? (default encoding)
Good afternoon <PERSON>Rajat Raina</PERSON>, how are you today?
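
Note that the log line above shows the classifier being loaded from the directory the script was run from, so the model files are still referenced by relative name. Inside a Rails app these may run into the same working-directory problem as the jars. A hedged sketch of one way to handle that, assuming the model files are copied into lib/nlp alongside the jars (NLP_DIR and model_path are hypothetical names introduced here for illustration):

```ruby
# Assumption: the model files are stored alongside the jars in lib/nlp.
NLP_DIR = File.expand_path('nlp', File.dirname(__FILE__))

# Hypothetical helper: absolute path to a model file. Raises a readable
# Ruby error early if the file is missing, rather than a Java exception.
def model_path(name, dir = NLP_DIR)
  path = File.expand_path(name, dir)
  raise ArgumentError, "model file not found: #{path}" unless File.file?(path)
  path
end

# Inside NLP#initialize these calls would replace the bare file names, e.g.:
#   maxentTagger.init(model_path('left3words-wsj-0-18.tagger'))
#   @classifier = crfclassifier.getClassifierNoExceptions(
#     model_path('ner-eng-ie.crf-4-conll.ser.gz'))
```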
Licensed under: CC-BY-SA with attribution