Question

I am trying to use the Stanford POS Tagger in NLTK, but I cannot run the example code given here: http://www.nltk.org/api/nltk.tag.html#module-nltk.tag.stanford

import nltk
from nltk.tag.stanford import POSTagger
st = POSTagger(r'english-bidirectional-distim.tagger',r'D:/stanford-postagger/stanford-postagger.jar')
st.tag('What is the airspeed of an unladen swallow?'.split())

I have already set the following environment variables:

CLASSPATH = D:/stanford-postagger/stanford-postagger.jar
STANFORD_MODELS =  D:/stanford-postagger/models/

Here is the error I keep getting:

Traceback (most recent call last):
  File "D:\pos_stanford.py", line 4, in <module>
    st = POSTagger(r'english-bidirectional-distim.tagger',
         r'D:/stanford-postagger/stanford-postagger.jar')
  ...
LookupError: NLTK was unable to find the english-bidirectional-distim.tagger file! Use software specific configuration paramaters or set the STANFORD_MODELS environment variable.

Some forums suggest that

File "C:\Python27\lib\site-packages\nltk\tag\stanford.py", line 45, in __init__
env_vars=('STANFORD_MODELS'), verbose=verbose)

should be changed so that there is a comma in

env_vars=('STANFORD_MODELS',), verbose=verbose)

but that doesn't solve the problem either. Please help me solve this issue.
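For context on the suggested comma: in Python, parentheses around a single string do not create a tuple, so the two spellings produce different types. The snippet below is only a minimal illustration of that language rule; the variable names are made up for the example, and it does not show how NLTK actually consumes env_vars.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
without_comma = ('STANFORD_MODELS')   # just a str
with_comma = ('STANFORD_MODELS',)     # a one-element tuple

print(type(without_comma).__name__)
print(type(with_comma).__name__)

# Iterating over the string yields single characters,
# while iterating over the tuple yields the whole variable name.
print(list(without_comma)[:3])
print(list(with_comma))
```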

Other information: I am using Windows 7 (64-bit), Python 2.7 (32-bit), and NLTK 2.0.


Solution

Note: I am posting this as an answer in case others face this issue in the future.

I finally found out what I did wrong. It turned out to be a blunder.

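The tagger file name is not 'english-bidirectional-distim.tagger' but 'english-bidirectional-distsim.tagger' (I had left out the second 's' in 'distsim'). No file with the misspelled name exists in the models directory, which is why NLTK raised the LookupError.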
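A typo like this is easier to spot if the model name is checked against the files actually present in the models directory before the tagger is constructed. Here is a minimal sketch using only the standard library; resolve_model is a hypothetical helper written for this answer, not part of NLTK, and it assumes the model files end in ".tagger".

```python
import difflib
import os


def resolve_model(name, models_dir):
    """Return the full path to a model file, or raise with close matches.

    Hypothetical helper (not an NLTK function): it only inspects the
    directory contents and suggests similarly named .tagger files.
    """
    path = os.path.join(models_dir, name)
    if os.path.isfile(path):
        return path

    # Collect the model files that really exist in the directory.
    available = sorted(f for f in os.listdir(models_dir) if f.endswith(".tagger"))

    # Suggest up to three file names that are close to the requested one.
    guesses = difflib.get_close_matches(name, available, n=3)
    hint = (" Did you mean: " + ", ".join(guesses) + "?") if guesses else ""
    raise IOError("No model named %r in %r.%s" % (name, models_dir, hint))
```

Calling resolve_model('english-bidirectional-distim.tagger', 'D:/stanford-postagger/models/') in my setup would have raised an error that names the correctly spelled file, and the returned path can be passed to POSTagger as the model argument.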

Licensed under: CC-BY-SA with attribution