Question

I am using the Stanford POS tagger for a project, and it works when it reads the tagger file from my computer (the project's folder). However, I need to upload the tagger file first and then read it from a URL. I have uploaded the file and am trying to load it by passing the URL to the constructor of the MaxentTagger class. My code is in C#, and I have subclassed MaxentTagger so that its constructor looks like this:

public Tagger()
{
    java.io.ByteArrayInputStream inputStream = new java.io.ByteArrayInputStream(System.IO.File.ReadAllBytes(@"C:\models\english-left3words-distsim.tagger"));
    base.readModelAndInit(null, new java.io.DataInputStream(inputStream), false);
}

However, I get this error when I run my code:

"An unhandled exception of type 'java.lang.RuntimeException' occurred in stanford-postagger.dll

Additional information: java.io.FileNotFoundException: Could not find a part of the path 'C:\u\nlp\data\pos_tags_are_useless\egw4-reut.512.clusters'."

Does anybody know why this happens and how I can resolve this? I appreciate any sort of help very much!


Solution

This error comes from the program trying to load a file that gives the distributional similarity mapping from words to clusters. It looks for that file at the location specified in the training properties file, and you naturally don't have a file at that location. This happens because no properly initialized TaggerConfig object exists at the time readModelAndInit() is called. The way the config gets initialized is unintuitive (it was badly architected), but you are only running into this because you are using a non-public API.

Why can't you just use the public API as follows?

MaxentTagger tagger = new MaxentTagger("http://my.url.com/models/english-left3words-distsim.tagger");
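
To illustrate how a single string argument can stand for either a local file or a remote model, here is a minimal Java sketch that uses only the JDK. The names ModelLocation and open are hypothetical and are not part of the Stanford tagger; the library's own lookup logic may well differ, so treat this as an illustration of the idea and not as its implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Stanford tagger API.
final class ModelLocation {
    private ModelLocation() {}

    // Opens a stream for a location given either as a URL ("scheme://...")
    // or as a plain file system path such as C:\models\x.tagger.
    static InputStream open(String location) throws IOException {
        // Assumption: anything of the form "scheme://..." is treated as a URL.
        if (location.matches("^[a-zA-Z][a-zA-Z0-9+.-]*://.*")) {
            return URI.create(location).toURL().openStream();
        }
        // Otherwise fall back to reading from the local file system.
        return Files.newInputStream(Paths.get(location));
    }
}
```

A caller would pass either "http://my.url.com/models/english-left3words-distsim.tagger" or a local path to ModelLocation.open and read the same bytes either way. With the real tagger, passing the string to the public constructor is still the simpler route, since it also sets up the configuration that readModelAndInit() needs.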
Licensed under: CC-BY-SA with attribution