How do I interface with Boxer from NLTK?

https://stackoverflow.com/questions/14539609

05-03-2022

Question

I want to be able to use Boxer as a semantic extractor inside NLTK.

I am testing with the following code:

#!/usr/bin/env python
import nltk
x = nltk.sem.boxer.Boxer()
x.interpret("The capital of Spain is Madrid .")

It fails with the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/nltk/sem/boxer.py", line 83, in interpret
    d, = self.batch_interpret_multisentence([[input]], discourse_ids, question, verbose)
  File "/usr/lib/python2.7/site-packages/nltk/sem/boxer.py", line 140, in batch_interpret_multisentence
    drs_dict = self._parse_to_drs_dict(boxer_out, use_disc_id)
  File "/usr/lib/python2.7/site-packages/nltk/sem/boxer.py", line 241, in _parse_to_drs_dict
    line = lines[i]
IndexError: list index out of range

Looking at the NLTK source at http://nltk.org/_modules/nltk/sem/boxer.html#Boxer, I found that the _parse_to_drs_dict(self, boxer_out, use_disc_id) function does an i += 4 that I haven't been able to understand.

Am I feeding bad input to Boxer?

Has anyone managed to make it work?

Debugging manually step by step, I can see that NLTK does receive the output from candc and boxer.

Solution

The newer version of NLTK available on GitHub seems to work without this problem.

In the 2.0.4 code, the i += 4 line is probably a bug.
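To see why a fixed jump like i += 4 can end in "list index out of range", here is a minimal sketch. It is not NLTK's code: the function name, the "id(" header check, and the four-line stride are all assumptions made for illustration. It only shows the general failure mode of a parser that trusts a fixed record layout.

```python
def read_payloads_fixed_stride(lines, stride=4):
    """Hypothetical parser (not NLTK's): assumes the payload line sits
    exactly `stride` lines after every header line."""
    payloads = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("id("):
            i += stride                # jump ahead, trusting the layout
            payloads.append(lines[i])  # IndexError if the output is shorter
        i += 1
    return payloads


# Works when the output has the expected shape.
print(read_payloads_fixed_stride(["id(1)", "a", "b", "c", "payload"]))

# Fails when the tool emits fewer lines than the parser expects.
try:
    read_payloads_fixed_stride(["id(1)", "a"])
except IndexError as err:
    print("IndexError:", err)
```

If the installed Boxer prints its result in a different layout than the parser was written for, the jump lands past the end of the list, which would match the traceback in the question.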

To get NLTK working with Boxer, download the source code from GitHub and install it with python setup.py install.

Be sure to set the CANDCHOME environment variable to the bin/ directory of your candc and boxer tools, and keep the models in a models/ directory next to bin/ (the path should be $CANDCHOME/../models).
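The layout described above can be checked before calling Boxer with a small script like the following. This is only a sketch: check_candc_layout is a hypothetical helper, not part of NLTK, and it assumes the executables are named candc and boxer with no file extension.

```python
import os


def check_candc_layout(candc_home):
    """Return a list of problems with the layout described above.

    Hypothetical helper, not part of NLTK. Assumes the binaries are
    named `candc` and `boxer` and live directly in `candc_home`.
    """
    if not candc_home:
        return ["CANDCHOME is not set"]
    problems = []
    for tool in ("candc", "boxer"):
        path = os.path.join(candc_home, tool)
        if not os.path.isfile(path):
            problems.append("missing binary: " + path)
    # The models are expected one level up from bin/, in models/.
    models = os.path.normpath(os.path.join(candc_home, os.pardir, "models"))
    if not os.path.isdir(models):
        problems.append("missing models directory: " + models)
    return problems


if __name__ == "__main__":
    found = check_candc_layout(os.environ.get("CANDCHOME"))
    for problem in found:
        print(problem)
    if not found:
        print("CANDCHOME layout looks fine")
```

An empty result only means the files are where the answer says they should be; it does not verify that the binaries run or that the models are complete.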

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow