Question

I'm using the GATE *SDK* and would like to modify the default ANNIE Gazetteer to include a simple annotation based on a new list definition I have created.

  • I've added my list definition to GATE-HOME\plugins\ANNIE\resources\gazetteer
  • I've added an entry in the lists.def file to point to my new list file. E.g. *open_source_software:opensouce*
  • I've created an annotation schema and added it to GATE-HOME\plugins\ANNIE\resources\schema
  • When I load ANNIE and run the application, it does not automatically identify the annotation. However, when I hover over a word that exists in the new list definition, ANNIE highlights the word and suggests the correct annotation

Is it possible to make this automatic so I don't have to train ANNIE? And can I do it programmatically?


Solution

By default the gazetteer creates annotations of type Lookup with majorType and minorType features, for example an entry in the .def file of

oss.lst:software:open_source

would create Lookups with majorType "software" and minorType "open_source" for entries in the list. The usual approach then would be to write JAPE rules that process the Lookup annotations and create the final annotations.
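To make the Lookup behaviour concrete, here is a rough plain-Java sketch of what the gazetteer conceptually does with one list. It does not use the GATE API: `LookupSketch`, `Lookup` and `annotate` are hypothetical names, and the matching is a plain case-sensitive substring search, whereas the real gazetteer has its own matching options.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of GATE. */
final class LookupSketch {

    /** Simplified stand-in for a Lookup annotation: a span plus two features. */
    static final class Lookup {
        final int start;
        final int end;
        final String majorType;
        final String minorType;

        Lookup(int start, int end, String majorType, String minorType) {
            this.start = start;
            this.end = end;
            this.majorType = majorType;
            this.minorType = minorType;
        }
    }

    /** Marks every occurrence of every list entry, with no further filtering. */
    static List<Lookup> annotate(String text, List<String> entries,
                                 String majorType, String minorType) {
        List<Lookup> found = new ArrayList<>();
        for (String entry : entries) {
            if (entry.isEmpty()) {
                continue; // skip blank lines in the list
            }
            int from = 0;
            int at;
            while ((at = text.indexOf(entry, from)) >= 0) {
                found.add(new Lookup(at, at + entry.length(), majorType, minorType));
                from = at + entry.length();
            }
        }
        return found;
    }
}
```

Every hit carries the same majorType and minorType, taken from the .def line, which is what a later rule would test for.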

It is possible to create other annotation types directly from the gazetteer, by adding more fields to the .def line:

oss.lst:software:open_source::Software

would create annotations of type Software instead of Lookup. The fields are, in order: list file name, major type, minor type, language, and annotation type. In general, though, I'd recommend sticking with Lookup and creating your final annotations with JAPE, so you can add further rules as necessary. The gazetteer blindly annotates every mention of anything in the list, and you often need heuristics to filter this down: for example, "Apache" might be considered software most of the time, but not when followed by the word "License".
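The field order is easy to get wrong, so here is a small plain-Java sketch that splits a .def line into the five fields listed above. `DefLine` is a hypothetical helper, not a GATE class, and the defaults it applies (empty minor type and language, "Lookup" as annotation type) are assumptions based on the behaviour described in this answer.

```java
/** Hypothetical helper, not part of GATE: one parsed line of a lists.def file. */
final class DefLine {
    final String listFile;
    final String majorType;
    final String minorType;      // "" when the field is absent
    final String language;       // "" when the field is absent
    final String annotationType; // "Lookup" when the field is absent

    private DefLine(String listFile, String majorType, String minorType,
                    String language, String annotationType) {
        this.listFile = listFile;
        this.majorType = majorType;
        this.minorType = minorType;
        this.language = language;
        this.annotationType = annotationType;
    }

    static DefLine parse(String line) {
        // A limit of -1 keeps empty fields, e.g. the empty language in "a:b:c::D".
        String[] f = line.trim().split(":", -1);
        if (f.length < 2 || f[0].isEmpty()) {
            throw new IllegalArgumentException("expected at least listFile:majorType, got: " + line);
        }
        return new DefLine(f[0], f[1], field(f, 2, ""), field(f, 3, ""), field(f, 4, "Lookup"));
    }

    private static String field(String[] f, int index, String fallback) {
        return (index < f.length && !f[index].isEmpty()) ? f[index] : fallback;
    }

    public static void main(String[] args) {
        String[] lines = {"oss.lst:software:open_source", "oss.lst:software:open_source::Software"};
        for (String line : lines) {
            DefLine d = parse(line);
            System.out.println(d.listFile + " -> " + d.annotationType
                    + " {majorType=" + d.majorType + ", minorType=" + d.minorType + "}");
        }
    }
}
```

Under this reading the first field is always the list file name, so a line with only two fields is taken as a file name followed by a major type.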

Finally, if you want to add your own gazetteer lists and/or JAPE rules then we recommend you don't edit the files under plugins/ANNIE directly. Instead create your own lists.def somewhere else, and load that into a separate instance of the gazetteer PR, inserted at the appropriate place in the pipeline.
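As a minimal sketch of the "somewhere else" layout, the following plain Java writes a list file and a lists.def that refers to it into a directory of your choosing. It only prepares the files and does not call GATE; `CustomGazetteerFiles` and the temporary directory are hypothetical, and it assumes the list file sits next to lists.def with one entry per line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of GATE: prepares files for a separate gazetteer. */
final class CustomGazetteerFiles {

    /** Writes oss.lst and a lists.def referencing it; returns the path of lists.def. */
    static Path write(Path dir, List<String> entries) throws IOException {
        Files.createDirectories(dir);
        // One gazetteer entry per line.
        Files.write(dir.resolve("oss.lst"), entries, StandardCharsets.UTF_8);
        Path def = dir.resolve("lists.def");
        // Fields: list file name, major type, minor type.
        Files.write(def, Arrays.asList("oss.lst:software:open_source"), StandardCharsets.UTF_8);
        return def;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("my-gazetteer");
        Path def = write(dir, Arrays.asList("Apache", "GATE", "Linux"));
        System.out.println("Custom list definition written to " + def.toUri());
    }
}
```

The resulting lists.def is the file the separate gazetteer PR instance would be pointed at, leaving everything under plugins/ANNIE untouched.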

License: CC-BY-SA with attribution
Not affiliated with StackOverflow