Question

Assume I have a UIMA toolchain that does something like this:

tokenize -> POS tagging -> assign my custom tags/annotations -> use the custom tags to assign more tags -> further processing.

Would it be possible to use a third-party component, say an entity recognizer (which uses POS tags but needs little else), right after the POS tagging, in between the two custom steps, or afterwards?

I'm asking this question because I can see complications due to the type systems. In particular, the most difficult case may be plugging a third-party ER annotator in between the custom steps or right after them. The third-party annotator won't expect our custom tags to be there.

However, these are just additional annotations that have to be "passed through" the annotator without it looking at or modifying them. So, in principle, I'd assume that this is possible. I just don't know whether UIMA supports this or whether it is all about writing full chains on your own with strict typing everywhere.

If this isn't possible out of the box, could we write the custom annotators so that they can be plugged in anywhere POS tags are available, regardless of whether other annotations are present? That is, as authors of annotators, we would assume that there are some annotations we require, some annotations we add, and any number of other annotations that may or may not be present, which we do not care about and only pass through.


Solution

"The third party annotator won't expect our custom tags to be there."

If I understand correctly, you are concerned that your custom annotations might collide with the third-party NER, right? They won't, unless your code adds exactly the same annotations.

This is the strength of UIMA: every Analysis Engine (AE) is independent of the others and only cares about the annotations that are passed to it in the CAS. For example, say you have an AE that expects annotations of type my.namespace.Token. It doesn't matter which AE created these annotations, as long as they are present in the CAS.
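To make the "pass-through" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not UIMA's API: the Annotation class, the Annotator interface, the derive helper, and all type names (pos.ProperNoun, my.custom.Tag, thirdparty.Entity, my.custom.DerivedTag) are made up for illustration. The shared list plays the role of the CAS, and each step only looks at the one type it needs.

```java
import java.util.ArrayList;
import java.util.List;

public class PassThroughDemo {

    /** Toy annotation: a type name plus a character span (not a UIMA class). */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;

        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Toy stand-in for an analysis engine working on a shared annotation store. */
    interface Annotator {
        void process(String text, List<Annotation> cas);
    }

    /** Adds one outputType annotation per inputType annotation; every other type is ignored. */
    static Annotator derive(String inputType, String outputType) {
        return (text, cas) -> {
            List<Annotation> added = new ArrayList<>();
            for (Annotation a : cas) {
                if (a.type.equals(inputType)) {
                    added.add(new Annotation(outputType, a.begin, a.end));
                }
            }
            cas.addAll(added); // append only; existing annotations stay untouched
        };
    }

    /** Runs: POS tags -> custom step 1 -> third-party-style NER -> custom step 2. */
    static List<Annotation> runPipeline() {
        String text = "Alice sleeps";
        List<Annotation> cas = new ArrayList<>();
        cas.add(new Annotation("pos.ProperNoun", 0, 5)); // pretend output of a POS tagger
        cas.add(new Annotation("pos.Verb", 6, 12));

        List<Annotator> pipeline = new ArrayList<>();
        pipeline.add(derive("pos.ProperNoun", "my.custom.Tag"));       // custom step 1
        pipeline.add(derive("pos.ProperNoun", "thirdparty.Entity"));   // NER between the custom steps
        pipeline.add(derive("my.custom.Tag", "my.custom.DerivedTag")); // custom step 2
        for (Annotator annotator : pipeline) {
            annotator.process(text, cas);
        }
        return cas;
    }

    public static void main(String[] args) {
        for (Annotation a : runPipeline()) {
            System.out.println(a.type + " [" + a.begin + ", " + a.end + ")");
        }
    }
}
```

The NER-like step in the middle never sees my.custom.Tag as anything it has to handle, and the second custom step still finds those tags afterwards. In real UIMA the type systems of the individual engines still have to be combined for the pipeline as a whole, but the principle is the same.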

The price to pay for this flexibility is that you (as a developer) have to make sure that the required annotations for each AE are present. For example, if an AE expects annotations of type my.namespace.Sentence but none are present, this AE won't be able to do any processing.
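The missing-input case can be sketched the same way. Again this is a plain Java toy model and not UIMA code: the store is just a list of type names, and tagEntities, requireType, and the type my.namespace.Entity are hypothetical names chosen for the example. It shows an annotator that silently does nothing when its input is absent, plus an optional fail-fast check you could put in front of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RequiredInputDemo {

    /** Toy annotator: adds one my.namespace.Entity per my.namespace.Sentence it finds. */
    static List<String> tagEntities(List<String> annotationTypes) {
        List<String> result = new ArrayList<>(annotationTypes);
        for (String type : annotationTypes) {
            if (type.equals("my.namespace.Sentence")) {
                result.add("my.namespace.Entity");
            }
        }
        return result; // unchanged copy if no sentences were present
    }

    /** Optional guard: fail loudly instead of silently producing nothing. */
    static void requireType(List<String> annotationTypes, String requiredType) {
        if (!annotationTypes.contains(requiredType)) {
            throw new IllegalStateException("Missing required annotation type: " + requiredType);
        }
    }

    public static void main(String[] args) {
        List<String> tokensOnly = Arrays.asList("my.namespace.Token");
        // No sentences in the store, so the annotator adds nothing.
        System.out.println(tagEntities(tokensOnly));
        try {
            requireType(tokensOnly, "my.namespace.Sentence");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether to skip quietly or fail fast is a design choice. Failing fast makes a badly ordered pipeline obvious, while skipping keeps the annotator usable in chains where the input is legitimately optional.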

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow