Question

How can we group all annotations between two annotations?

I'm new to GATE and am trying to group annotations together. I'm not sure whether this is possible, so please help. For example, in the following text:

Page-1
Age:53 
Person: Nathan

Page-2
Treatment : Initial Evaluation
History: Yes

Page-3
..........

Suppose my gazetteer list defines different tags: a page tag for each page number, plus Age, Person, Treatment, History, etc. I want to group all tags between Page-1 and Page-2 under the Page-1 annotation, and all tags between Page-2 and Page-3 under the Page-2 annotation.

Please let me know if more information is required on this question.

Thanks in advance.

Solution

I'm not entirely sure what you mean by "group together", but you can certainly create annotations that span the content of each "page". Assuming you have a PageNumber annotation on each "Page-1", "Page-2", etc., you can use something like this to create annotations spanning from one PageNumber to the next. I'm using a control = once JAPE to do this; you could equivalently use a Groovy script or a custom PR (processing resource).

Imports: { import static gate.Utils.*; }
Phase: PageSpans
Input: PageNumber
Options: control = once

Rule: PageSpan
({PageNumber})
-->
{
  try {
    List<Annotation> numbers = inDocumentOrder(inputAS.get("PageNumber"));
    for(int i = 0; i < numbers.size(); i++) {
      outputAS.add(start(numbers.get(i)), // from start of this PageNumber, to...
                   (i+1 < numbers.size()
                     ? start(numbers.get(i+1)) // start of the next number, or...
                     : end(doc) // ...if no more PageNumbers then end of document
                   ),
                   "Page",
                   // store the text under the PageNumber as a feature of Page
                   featureMap("id", stringFor(doc, numbers.get(i))));
    }
  } catch(InvalidOffsetException e) {
    throw new JapeException("Invalid offset from existing annotation", e);
  }
}
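
To make the offset arithmetic in the rule above easier to follow, here is a minimal plain-Java sketch of the same idea that does not use GATE at all. The class `PageSpanSketch`, its `Span` type and the `Page-\d+` regular expression are all hypothetical stand-ins for the PageNumber annotations; only the "start of this marker to start of the next, or end of document" logic mirrors the JAPE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: plain strings and offsets stand in for GATE annotations.
public class PageSpanSketch {

    /** A half-open character span [start, end), labelled with its marker text. */
    static final class Span {
        final int start;
        final int end;
        final String id;

        Span(int start, int end, String id) {
            this.start = start;
            this.end = end;
            this.id = id;
        }
    }

    /** Each span runs from one marker's start to the next marker's start, or to the end of the text. */
    static List<Span> pageSpans(String text, Pattern marker) {
        List<Integer> starts = new ArrayList<>();
        List<String> ids = new ArrayList<>();
        Matcher m = marker.matcher(text);
        while (m.find()) {
            starts.add(m.start());
            ids.add(m.group());
        }
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            // the last marker has no successor, so its span ends at the end of the text
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : text.length();
            spans.add(new Span(starts.get(i), end, ids.get(i)));
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Page-1\nAge:53\nPerson: Nathan\n\n"
                    + "Page-2\nTreatment : Initial Evaluation\nHistory: Yes\n";
        for (Span s : pageSpans(text, Pattern.compile("Page-\\d+"))) {
            System.out.println(s.id + " covers [" + s.start + ", " + s.end + ")");
        }
    }
}
```

As in the JAPE, each span includes its own page marker and stops just before the next one, so consecutive spans never overlap.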

In your comment you ask about moving all the annotations under each "page" into a separate annotation set. This would be relatively straightforward once you have done the above, and if you have the page number as a feature on your Page annotations as I have done with the "id" feature. Then you could define another JAPE that does something like this:

Imports: { import static gate.Utils.*; }
Phase: SetPerPage
Input: Age X Y // and whatever other annotation types you want to copy
Options: control = all

Rule: MoveToPageSet
({Age}|{X}|{Y}):entity
-->
:entity {
  try {
    for(Annotation e : entityAnnots) {
      // find the (only) Page annotation that covers this entity
      Annotation thePage = getOnlyAnn(getCoveringAnnotations(inputAS, e, "Page"));
      // get the corresponding annotation set
      AnnotationSet pageSet = doc.getAnnotations(
              (String)thePage.getFeatures().get("id"));
      // and copy the annotation into it
      pageSet.add(start(e), end(e), e.getType(), e.getFeatures());
    }
  } catch(InvalidOffsetException e) {
    throw new JapeException("Invalid offset from existing annotation", e);
  }
  // optionally remove from input set
  // inputAS.removeAll(entityAnnots);
}
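
The grouping that the second rule produces can also be pictured outside GATE. The following is only a sketch under stated assumptions: `GroupByPageSketch`, `groupByPage` and the offsets in `main` are made up for illustration, and a `TreeMap` lookup stands in for the covering-annotation query. Each entity goes to the page whose start offset is the closest one at or before the entity's own offset.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: offsets and names stand in for GATE annotations and annotation sets.
public class GroupByPageSketch {

    /**
     * Groups entities by the page that contains them.
     * pageStarts maps each page's start offset to its id;
     * entityOffsets maps each entity name to its start offset.
     */
    static Map<String, List<String>> groupByPage(TreeMap<Integer, String> pageStarts,
                                                 Map<String, Integer> entityOffsets) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String id : pageStarts.values()) {
            groups.put(id, new ArrayList<>());
        }
        for (Map.Entry<String, Integer> e : entityOffsets.entrySet()) {
            // the page with the greatest start offset that is <= the entity's offset
            Map.Entry<Integer, String> page = pageStarts.floorEntry(e.getValue());
            if (page != null) { // entities before the first page marker are skipped
                groups.get(page.getValue()).add(e.getKey());
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> pages = new TreeMap<>();
        pages.put(0, "Page-1");
        pages.put(30, "Page-2");

        Map<String, Integer> entities = new LinkedHashMap<>();
        entities.put("Age", 7);
        entities.put("Person", 14);
        entities.put("Treatment", 37);

        System.out.println(groupByPage(pages, entities));
    }
}
```

The map keyed by page id plays the role of the per-page annotation sets: one bucket per page, filled with whatever falls inside that page's span.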
License: CC-BY-SA with attribution