Question

I'm performing POS tagging with the Stanford POS Tagger. The tagger only returns one possible tagging for the input sentence. For instance, when provided with the input sentence "The clown weeps.", the POS tagger produces the (erroneous) tagging "The_DT clown_NN weeps_NNS ._.".

However, my application will try to parse the result, and may reject a POS tagging because there is no way to parse it. Hence, in this example, it would reject "The_DT clown_NN weeps_NNS ._." but would accept "The_DT clown_NN weeps_VBZ ._.", which I assume the tagger considers a lower-confidence tagging.

I would therefore like the POS tagger to provide multiple hypotheses for the tagging of each word, annotated by some kind of confidence value. In this way, my application could choose the POS tagging with highest confidence that achieves a valid parsing for its purposes.

I have found no way to ask the Stanford POS Tagger to produce multiple (n-best) tagging hypotheses for each word (or even for the whole sentence). Is there a way to do this? (Alternatively, I would also be OK with using another POS tagger of comparable performance that supports this.)


Solution

OpenNLP supports retrieving the n-best tag sequences for POS tagging:

Some applications need to retrieve the n-best POS tag sequences and not only the best sequence. The topKSequences method is capable of returning the top sequences. It can be called in a similar way to tag.

Sequence[] topSequences = tagger.topKSequences(sent);

Each Sequence object contains one tag sequence. The tags can be retrieved via Sequence.getOutcomes(), which returns the array of tags, and Sequence.getProbs() returns the array of the corresponding probabilities.
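For instance, here is a minimal sketch of how this fits together, assuming OpenNLP 1.x; the class name TopKExample and the model path are placeholders of mine (en-pos-maxent.bin is the standard downloadable English maxent model):

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.util.Sequence;

public class TopKExample {
    public static void main(String[] args) throws Exception {
        // Load a pre-trained English POS model (path is a placeholder).
        try (InputStream in = new FileInputStream("en-pos-maxent.bin")) {
            POSTaggerME tagger = new POSTaggerME(new POSModel(in));
            String[] sent = {"The", "clown", "weeps", "."};

            // n-best tag sequences for the whole sentence, best first.
            Sequence[] topSequences = tagger.topKSequences(sent);

            for (Sequence seq : topSequences) {
                List<String> tags = seq.getOutcomes();
                double[] probs = seq.getProbs();
                for (int i = 0; i < sent.length; i++) {
                    System.out.printf("%s_%s (%.3f) ", sent[i], tags.get(i), probs[i]);
                }
                System.out.println();
            }
        }
    }
}

Your application could then walk the returned sequences in order and keep the first tagging that it manages to parse.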

There is also a way to make spaCy do something like this:

import numpy
from spacy.language import Language
from spacy.pipeline import Tagger
from spacy.tokens import Doc, Token

# Store the per-token tag probabilities on the Doc, and expose each
# token's row of scores through a Token extension.
Doc.set_extension('tag_scores', default=None)
Token.set_extension('tag_scores', getter=lambda token: token.doc._.tag_scores[token.i])

class ProbabilityTagger(Tagger):
    def predict(self, docs):
        tokvecs = self.model.tok2vec(docs)
        scores = self.model.softmax(tokvecs)
        guesses = []
        for i, doc_scores in enumerate(scores):
            # Keep the full probability distribution over tags for each token.
            docs[i]._.tag_scores = doc_scores
            doc_guesses = doc_scores.argmax(axis=1)

            if not isinstance(doc_guesses, numpy.ndarray):
                doc_guesses = doc_guesses.get()  # copy from GPU if needed
            guesses.append(doc_guesses)
        return guesses, tokvecs

# Register the subclass so pipelines use it in place of the stock tagger.
Language.factories['tagger'] = lambda nlp, **cfg: ProbabilityTagger(nlp.vocab, **cfg)

Then each token will have a tag_scores extension holding the probabilities for each part of speech in spaCy's tag map. (Note that this targets spaCy v2's pipeline API.)

Source: https://github.com/explosion/spaCy/issues/2087

OTHER TIPS

I don't know of a tagger that offers several POS interpretations for English phrases (my experience is with Spanish). Another option for you could be to change or combine taggers: using your own example, in FreeLing I got your expected result.

[Screenshot: FreeLing's analysis of the example sentence, showing the expected tagging]

Additionally, you can see that FreeLing also shows you other possible POS interpretations for certain words in their context.

Note: if you have used FreeLing, you may know that for machine readability you can use the XML output (shown below your results), and for automation you can integrate FreeLing with Python/Java, although I usually prefer to just call it via the command line.

We found that the default model for POS tagging wasn't good enough. It turned out that using a different model yields much better tags. We are currently using wsj-0-18-bidirectional-distsim, and its performance is good enough for most tasks. I include it like so:

props.put("pos.model",
    "edu/stanford/nlp/models/pos-tagger/wsj-bidirectional/wsj-0-18-bidirectional-distsim.tagger");
props.put("annotators", "tokenize, ssplit, pos, ...");
pipeline = new StanfordCoreNLP(props);
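As a quick sanity check, the tags can then be read back out with the standard CoreNLP annotation API; a minimal sketch using the asker's example sentence:

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;

// Run the sentence through the pipeline configured above.
Annotation doc = new Annotation("The clown weeps.");
pipeline.annotate(doc);
for (CoreLabel tok : doc.get(CoreAnnotations.TokensAnnotation.class)) {
    System.out.print(tok.word() + "_" + tok.tag() + " ");
}
System.out.println();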
Licensed under: CC-BY-SA with attribution