Does anyone know of a good quick-and-dirty text/grammar parser?
29-09-2019
Question
I have a "mad lib" scenario in which I want to
a) determine the part of speech of every word (or most words) in a sentence, and
b) have the user select alternatives to those words, or replace them computationally with equivalent words.
I looked at the Stanford Parser, but it's a bit slow ... any suggestions?
Solution
Use a POS tagger
If you're just using the part-of-speech (POS) tags and not the parse trees, you don't actually need to use a parser. Instead, you can just use a standalone POS tagger.
POS tagging is much faster than phrase-structure parsing. On a Xeon E5520, the Stanford POS tagger can tag 1700 sentences in 3 seconds, while the same data takes about 10 minutes to parse using the Stanford Parser (Cer et al. 2010).
There's a fairly comprehensive list of other POS taggers here.
OTHER TIPS
For a toolkit approach, there's the NLTK toolkit. It's in Python, so like-for-like speed might not be quite what you want; but being a toolkit intended for teaching, it lets you implement a lot of different approaches. I.e., it might be easy to implement a quick parser/tagger even though the underlying language might not be the fastest available.