Question

When parsing and tagging words with OpenNLP, I was wondering what the tags (e.g. S, NP, VP, ADJP) actually mean. I found a few by searching the web, but some are still missing and I can't find them anywhere. Currently my code outputs this:

The movie was really good

\-S - S
 |-NP - {Unknown}
 |  |-DT - Determiner
 |  | \- The - The
 |  \- NN - Noun, Singular or mass
 |    \- movie - movie
 \-VP - {Unknown}
   |-VBD - Verb, past tense
   | \- was - was
   \- ADJP - {Unknown}
      |-RB - Adverb
      | \-really - really
      \-JJ - Adjective
        \- good - good

As you can see, I have managed to map some of them, such as NN to "Noun, Singular or mass", but I am unable to find any references for S, NP, VP, or ADJP.

Thanks in advance.


Solution

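The labels in your output are of two kinds: part-of-speech tags on the individual words (DT, NN, VBD, RB, JJ) and syntactic category labels on the phrases that contain them. The four you are missing are syntactic categories: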

  • S : sentence
  • NP : noun phrase
  • VP : verb phrase
  • ADJP : adjective phrase

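Here is a list of tags used in the Penn Treebank, which is the corpus OpenNLP uses. Different projects use different abbreviations, so the same label can mean different things. In the Penn Treebank, NP is a noun phrase and NNP is a singular proper noun, whereas the Brown Corpus tag set uses NP for a proper noun.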
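If you want the phrase labels to stop showing up as {Unknown}, one option is to put them in the same lookup as the word-level tags. The following is only a minimal sketch: the class name TagDescriptions and the method describe are made up for illustration and are not part of OpenNLP, and the table holds only the tags from your output plus a few other common Penn Treebank phrase labels (SBAR, PP, ADVP).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP: maps Penn Treebank labels
// to human-readable descriptions.
final class TagDescriptions {
    private static final Map<String, String> DESCRIPTIONS = new HashMap<>();

    static {
        // Clause- and phrase-level labels (syntactic categories)
        DESCRIPTIONS.put("S", "Sentence");
        DESCRIPTIONS.put("SBAR", "Subordinate clause");
        DESCRIPTIONS.put("NP", "Noun phrase");
        DESCRIPTIONS.put("VP", "Verb phrase");
        DESCRIPTIONS.put("PP", "Prepositional phrase");
        DESCRIPTIONS.put("ADJP", "Adjective phrase");
        DESCRIPTIONS.put("ADVP", "Adverb phrase");

        // Word-level labels (part-of-speech tags) seen in the question
        DESCRIPTIONS.put("DT", "Determiner");
        DESCRIPTIONS.put("NN", "Noun, Singular or mass");
        DESCRIPTIONS.put("VBD", "Verb, past tense");
        DESCRIPTIONS.put("RB", "Adverb");
        DESCRIPTIONS.put("JJ", "Adjective");
    }

    private TagDescriptions() {
        // Static utility class, not meant to be instantiated.
    }

    // Returns the description for a label, or "{Unknown}" if it is not in the table.
    static String describe(String tag) {
        return DESCRIPTIONS.getOrDefault(tag, "{Unknown}");
    }

    public static void main(String[] args) {
        // Print a few labels in the same "TAG - description" form as the question.
        for (String tag : new String[] {"S", "NP", "VP", "ADJP", "NN"}) {
            System.out.println(tag + " - " + describe(tag));
        }
    }
}
```

The table is deliberately incomplete. The full Penn Treebank tag set has more entries at both levels, so anything not listed still falls back to {Unknown} rather than to a guessed description.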

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow