There are three issues I can see here.
- First, as ashingel says, spaces are not represented as Token annotations - this is deliberate as in most cases you don't care about the spacing between words, only the words themselves.
- Second, the trailing ({Token.kind=="word"}) means that the rule will only match when "Psychological Evaluation" is followed by another word before the end of the current sentence (because you've got Split in the Input line).
- Third, you're only binding the Meddoc label to the "Evaluation" token, not to the whole match.
I would try and simplify the LHS of the rule:
Phase: ConjunctionIdentifier
Input: Token Split
Rule: Medicalrule
(
  {Token.string == "Psychological"}
  {Token.string == "Evaluation"}
):meddoc
and for the RHS (a) you don't need to do the explicit bindings.get because you've used a labelled block so you already have the bound annots available, (b) you should use outputAS instead of annotations, and (c) you should generally avoid the add method that takes nodes, as it isn't safe if the input and output annotation sets are different. If you're using a recent snapshot of GATE then the gate.Utils static methods can help you a lot here:
:meddoc {
  Utils.addAnn(outputAS, meddocAnnots, "CC",
      Utils.featureMap("rule", "Medicalrule"));
}
If you're using 7.1 or earlier then the addAnn method isn't available so it's slightly more convoluted:
:meddoc {
  try {
    outputAS.add(Utils.start(meddocAnnots), Utils.end(meddocAnnots), "CC",
        Utils.featureMap("rule", "Medicalrule"));
  } catch(InvalidOffsetException e) { // can't happen, but won't compile without
    throw new JapeException(e);
  }
}
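Putting the pieces together, the whole grammar file would look roughly like the sketch below. It uses the addAnn variant, so it assumes a GATE version where that method is available. The --> separator between the LHS and the RHS is standard JAPE syntax that the fragments above leave out, and the comments are mine. I haven't run this against your document, so treat it as a starting point rather than a tested grammar.

```jape
Phase: ConjunctionIdentifier
Input: Token Split

Rule: Medicalrule
(
  // two adjacent Token annotations; no pattern for the space is needed
  // because spaces are not Token annotations and are not in the Input line
  {Token.string == "Psychological"}
  {Token.string == "Evaluation"}
):meddoc   // the label covers both tokens, not just the last one
-->
:meddoc {
  // meddocAnnots is the AnnotationSet that JAPE binds for the :meddoc label
  Utils.addAnn(outputAS, meddocAnnots, "CC",
      Utils.featureMap("rule", "Medicalrule"));
}
```

With this rule the CC annotation should span from the start of "Psychological" to the end of "Evaluation", and it no longer depends on another word following in the same sentence.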
Finally, just to check, you did definitely add your new JAPE Transducer PR to the end of the pipeline?