I'm not entirely sure what you mean by "group together", but you can certainly create annotations that span the content of each "page". Assuming you have a PageNumber annotation on each "Page-1", "Page-2", etc., you can use something like this to create annotations spanning from one PageNumber to the next. I'm using a control = once JAPE grammar to do this, so the rule fires only once and the Java on the right-hand side loops over all the PageNumber annotations itself. You could equivalently use a Groovy script or a custom PR (processing resource).

Imports: { import static gate.Utils.*; }
Phase: PageSpans
Input: PageNumber
Options: control = once

Rule: PageSpan
({PageNumber})
-->
{
  try {
    List<Annotation> numbers = inDocumentOrder(inputAS.get("PageNumber"));
    for(int i = 0; i < numbers.size(); i++) {
      outputAS.add(start(numbers.get(i)), // from start of this PageNumber, to...
          (i+1 < numbers.size()
            ? start(numbers.get(i+1)) // start of the next number, or...
            : end(doc)                // ...if no more PageNumbers then end of document
          ),
          "Page",
          // store the text under the PageNumber as a feature of Page
          featureMap("id", stringFor(doc, numbers.get(i))));
    }
  } catch(InvalidOffsetException e) {
    throw new JapeException("Invalid offset from existing annotation", e);
  }
}
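
The offset arithmetic in the rule above can be checked on its own, without GATE on the classpath. The following is only a sketch in plain Java: PageSpanSketch, Span and pageSpans are names made up for this illustration, and the marker offsets are invented sample data rather than anything from your document.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the span computation, with no GATE dependency.
// PageSpanSketch, Span and pageSpans are hypothetical names for illustration.
class PageSpanSketch {

    // A span from a start offset to an end offset, labelled with the page id.
    static final class Span {
        final long start;
        final long end;
        final String id;

        Span(long start, long end, String id) {
            this.start = start;
            this.end = end;
            this.id = id;
        }

        @Override
        public String toString() {
            return id + "[" + start + "," + end + ")";
        }
    }

    // markerStarts: start offsets of the page markers, in document order.
    // ids: the text under each marker. docEnd: end offset of the document.
    static List<Span> pageSpans(long[] markerStarts, String[] ids, long docEnd) {
        List<Span> pages = new ArrayList<>();
        for (int i = 0; i < markerStarts.length; i++) {
            // each page ends where the next marker starts, or at the document end
            long end = (i + 1 < markerStarts.length) ? markerStarts[i + 1] : docEnd;
            pages.add(new Span(markerStarts[i], end, ids[i]));
        }
        return pages;
    }

    public static void main(String[] args) {
        long[] starts = {0L, 120L, 300L};
        String[] ids = {"Page-1", "Page-2", "Page-3"};
        // prints [Page-1[0,120), Page-2[120,300), Page-3[300,450)]
        System.out.println(pageSpans(starts, ids, 450L));
    }
}
```

Note that any text before the first marker is not covered by any page, which matches what the JAPE rule does.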

In your comment you ask about moving all the annotations under each "page" into a separate annotation set. This is relatively straightforward once you have done the above, provided you have the page number as a feature on your Page annotations, as I have done with the "id" feature. You could then define another JAPE grammar that does something like this:

Imports: { import static gate.Utils.*; }
Phase: SetPerPage
Input: Age X Y // and whatever other annotation types you want to copy
Options: control = all

Rule: MoveToPageSet
({Age}|{X}|{Y}):entity
-->
:entity {
  try {
    for(Annotation e : entityAnnots) {
      // find the (only) Page annotation that covers this entity
      Annotation thePage = getOnlyAnn(getCoveringAnnotations(inputAS, e, "Page"));
      // get the annotation set named after the page's "id" feature
      AnnotationSet pageSet = doc.getAnnotations(
          (String)thePage.getFeatures().get("id"));
      // and copy the annotation into it
      // (the copy shares the original annotation's feature map)
      pageSet.add(start(e), end(e), e.getType(), e.getFeatures());
    }
  } catch(InvalidOffsetException ioe) {
    throw new JapeException("Invalid offset from existing annotation", ioe);
  }
  // optionally remove from input set
  // inputAS.removeAll(entityAnnots);
}
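
The lookup-and-copy step can also be sketched in plain Java, without GATE, to show the intended result: one bucket per page id, holding the entities that page covers. PerPageSketch, coveringPage and groupByPage are hypothetical names and the offsets are invented sample data. As far as I know, getOnlyAnn throws when the set does not contain exactly one annotation, so an entity that straddles a page boundary would need separate handling; the sketch mirrors that assumption by throwing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of grouping entities by their covering page.
// All names here are hypothetical and exist only for illustration.
class PerPageSketch {

    // Returns the index of the single page whose span covers [start, end];
    // throws if there is no such page or more than one.
    static int coveringPage(long[] pageStarts, long[] pageEnds, long start, long end) {
        int found = -1;
        for (int i = 0; i < pageStarts.length; i++) {
            if (pageStarts[i] <= start && end <= pageEnds[i]) {
                if (found >= 0) {
                    throw new IllegalStateException("more than one covering page");
                }
                found = i;
            }
        }
        if (found < 0) {
            throw new IllegalStateException("no covering page");
        }
        return found;
    }

    // Builds a map from page id to the types of the entities on that page,
    // standing in for one annotation set per page.
    static Map<String, List<String>> groupByPage(long[] pageStarts, long[] pageEnds,
            String[] pageIds, long[][] entityOffsets, String[] entityTypes) {
        Map<String, List<String>> sets = new LinkedHashMap<>();
        for (int k = 0; k < entityTypes.length; k++) {
            int p = coveringPage(pageStarts, pageEnds,
                    entityOffsets[k][0], entityOffsets[k][1]);
            sets.computeIfAbsent(pageIds[p], id -> new ArrayList<>()).add(entityTypes[k]);
        }
        return sets;
    }

    public static void main(String[] args) {
        long[] starts = {0L, 120L};
        long[] ends = {120L, 300L};
        String[] ids = {"Page-1", "Page-2"};
        long[][] entities = {{10L, 12L}, {130L, 133L}, {200L, 205L}};
        String[] types = {"Age", "X", "Age"};
        // prints {Page-1=[Age], Page-2=[X, Age]}
        System.out.println(groupByPage(starts, ends, ids, entities, types));
    }
}
```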