I think you need this:

    Rjb::load('/path/to/jar/stanford-postagger.jar:/path/to/jar/stanford-ner.jar', ['-Xmx200m'])

I just tried this and it works. Create a directory called nlp inside lib, put the jars there, and then create a class that loads the jars using their full paths.
You end up with this layout:

    ├── lib
    │   ├── nlp
    │   │   ├── stanford-ner.jar
    │   │   └── stanford-postagger.jar
    │   └── nlp.rb

lib/nlp.rb:

    require 'rjb'

    class NLP
      def initialize
        # Build absolute paths to the jars in lib/nlp, relative to this file.
        pos_tagger = File.expand_path('../nlp/stanford-postagger.jar', __FILE__)
        ner = File.expand_path('../nlp/stanford-ner.jar', __FILE__)

        # Start the JVM with both jars on the classpath (':' is the Unix separator).
        Rjb::load("#{pos_tagger}:#{ner}", ['-Xmx200m'])

        crf_classifier = Rjb::import('edu.stanford.nlp.ie.crf.CRFClassifier')
        maxent_tagger = Rjb::import('edu.stanford.nlp.tagger.maxent.MaxentTagger')

        # Model files are given by relative path, so run from the directory that contains them.
        maxent_tagger.init('left3words-wsj-0-18.tagger')
        @classifier = crf_classifier.getClassifierNoExceptions('ner-eng-ie.crf-4-conll.ser.gz')
      end

      # Tags named entities in the sentence with inline XML, e.g. <PERSON>...</PERSON>.
      def get_entities(sentence)
        @classifier.testStringInlineXML(sentence)
      end
    end

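The classpath handed to Rjb::load above is joined with ':', which is the separator on Unix-like systems; a JVM on Windows expects ';'. Here is a minimal sketch of building the string portably, assuming all the jars sit in one directory. `build_classpath` is a hypothetical helper name, not part of rjb; it only uses Ruby's standard `Dir.glob` and `File::PATH_SEPARATOR`.

```ruby
# Hypothetical helper (not part of rjb): collect every .jar in a directory
# and join the absolute paths with the platform's path separator
# (':' on Unix-like systems, ';' on Windows).
def build_classpath(jar_dir)
  jars = Dir.glob(File.join(File.expand_path(jar_dir), '*.jar')).sort
  raise ArgumentError, "no jars found in #{jar_dir}" if jars.empty?

  jars.join(File::PATH_SEPARATOR)
end
```

With the lib/nlp layout, the hand-written string could then become `Rjb::load(build_classpath(File.expand_path('../nlp', __FILE__)), ['-Xmx200m'])`.
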
A little test script (t.rb):

    require_relative 'lib/nlp'

    n = NLP.new
    puts n.get_entities("Good afternoon Rajat Raina, how are you today?")

Output:

    $ ruby t.rb
    Loading classifier from /Users/brendan/code/ruby/ruby-nlp/ner-eng-ie.crf-4-conll.ser.gz ... done [1.2 sec].
    Getting data from Good afternoon Rajat Raina, how are you today? (default encoding)
    Good afternoon <PERSON>Rajat Raina</PERSON>, how are you today?
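
The tagged string is usually easier to work with once the entities are pulled out of it. Here is a minimal sketch, assuming the output always has the flat `<TYPE>text</TYPE>` form shown above with no nested tags. `extract_entities` is a hypothetical helper, not part of Stanford NER or rjb.

```ruby
# Hypothetical helper: pull [type, text] pairs out of inline-XML output.
# Assumes flat, well-formed tags with no nesting; the \1 backreference
# makes the closing tag match the opening one.
def extract_entities(tagged)
  tagged.scan(%r{<([A-Z_]+)>(.*?)</\1>})
end

tagged = 'Good afternoon <PERSON>Rajat Raina</PERSON>, how are you today?'
p extract_entities(tagged)
# prints [["PERSON", "Rajat Raina"]]
```

A regex is enough for this simple shape; if the text itself may contain angle brackets, a real XML parser would be the safer choice.
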