Question

I am trying to use part-of-speech tagging in NLTK and have used these commands:

>>> text = nltk.word_tokenize("And now for something completely different")

>>> nltk.pos_tag(text)

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    nltk.pos_tag(text)
  File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 99, in pos_tag
    tagger = load(_POS_TAGGER)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 605, in load
    resource_val = pickle.load(_open(resource_url))
  File "C:\Python27\lib\site-packages\nltk\data.py", line 686, in _open
    return find(path).open()
  File "C:\Python27\lib\site-packages\nltk\data.py", line 467, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
found.  Please use the NLTK Downloader to obtain the resource:

The error message says that english.pickle was not found.

I have downloaded all of the corpora, and the english.pickle file is there in the maxent_treebank_pos_tagger folder.

What can I do to get this to work?


Solution

Your Python installation is not able to reach the maxent treebank POS tagger.

First, check if the tagger is indeed there: Start Python from the command line.

>>> import nltk

Then you can check using

>>> dir(nltk)

Look through the list to see if maxent and treebank are both there.

Easier would be to type

>>> "maxent" in dir(nltk)
True
>>> "treebank" in dir(nltk)
True

Run nltk.download(), open the Models tab, and check whether the treebank tagger shows as installed. You should also try downloading the tagger again.

[Screenshot: NLTK Downloader, Models tab]
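
One way to confirm whether Python can actually reach the file is to look in every directory on NLTK's search path, which is held in the list nltk.data.path. The following is only a minimal sketch, not NLTK's own lookup logic: locate_resource is a hypothetical helper, and it checks for unpacked files only, whereas NLTK itself can also read resources from zip archives.

```python
import os

try:
    import nltk  # optional here, only needed to read the real search path
except ImportError:
    nltk = None

RESOURCE = "taggers/maxent_treebank_pos_tagger/english.pickle"


def locate_resource(resource, search_paths):
    """Return the directories in search_paths that contain resource.

    Hypothetical helper: checks unpacked files only, not zip archives.
    """
    # Resource names use forward slashes; convert to the local path style.
    relative = os.path.join(*resource.split("/"))
    return [p for p in search_paths
            if os.path.isfile(os.path.join(p, relative))]


if nltk is not None:
    print("NLTK search path:")
    for directory in nltk.data.path:
        print("  ", directory)
    print("Resource found in:", locate_resource(RESOURCE, nltk.data.path))
```

If the result is empty even though the file exists on disk, the data was probably installed in a directory that is not on the search path. In that case, appending the directory to nltk.data.path or setting the NLTK_DATA environment variable may help.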

OTHER TIPS

If you don't want to use the downloader GUI, you can run the following commands in a Python or IPython shell:

import nltk
nltk.download('punkt')
nltk.download('maxent_treebank_pos_tagger')
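
If the same script may run on machines where the data is not installed yet, the download can be made conditional. The sketch below is an assumption about how one might wrap this, not part of NLTK: ensure_resource is a hypothetical helper, and its find and download parameters exist only so the logic can be tried without network access. It relies on nltk.data.find raising LookupError for a missing resource, as the traceback in the question shows.

```python
def ensure_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only if resource_path cannot be found.

    Returns True if a download was triggered, False otherwise.
    Hypothetical helper; find and download default to NLTK's own functions.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the defaults are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)
    except LookupError:
        # The resource is not on the search path, so fetch its package.
        download(package_id)
        return True
    return False
```

For the tagger in the question, the call would be ensure_resource('taggers/maxent_treebank_pos_tagger/english.pickle', 'maxent_treebank_pos_tagger').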

NLTK provides over 50 corpora and lexical resources, such as WordNet, for free: http://www.nltk.org/nltk_data/. If the downloader fails with a Google Code "401: Authorization Required" error, use http://nltk.github.com/nltk_data/ as the server index instead of the googlecode one.

Licensed under: CC-BY-SA with attribution