Ignored XML elements show up near eXist-db's Lucene search results

Question

Let's take eXist-db's Shakespeare app as the point of departure. Say you have index entries there. You do not want hits in the index terms, which the index configuration takes care of, but you also do not want them output to the KWIC display, and that part you have to code yourself.

If you look in app.xql, you will see a function named app:filter that is called from app:show-hits. You can use it to remove parts of the output to the KWIC display, based on the name of the parent element of the text node being output.

This will give what you want:

declare %private function app:filter($node as node(), $mode as xs:string) as xs:string? {
    (: Read the qnames of all ignore elements from the index configuration. :)
    let $ignored-qnames := doc('/db/system/config/db/apps/shakespeare/collection.xconf')//*:ignore/@qname/string()
    (: Strip the namespace prefix, if there is one, so the names can be compared with local-name(). :)
    let $ignored-elements :=
        for $qname in $ignored-qnames
        return
            if (contains($qname, ':'))
            then substring-after($qname, ':')
            else $qname
    return
        (: Drop text whose parent element should not appear in the KWIC display. :)
        if (local-name($node/parent::*) = ('speaker', 'stage', 'head', $ignored-elements))
        then ()
        else
            if ($mode eq 'before')
            then concat($node, ' ')
            else concat(' ', $node)
};

You can of course hard-code the elements to ignore, as in ('speaker', 'stage', 'head', 'sic', 'term', 'note'), but I wanted to show that you do not have to. ('index' is not needed in the list, since the text of an index entry is always wrapped in 'term'.) However, if you do not hard-code the elements, you should certainly move the assignment of $ignored-elements out of the function, for instance to a variable declared in the query prolog, so that collection.xconf is not read from the database for every text node you encounter. Doing the lookup inside the function is wasteful; I have put it all in one function only for the sake of simplicity.
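As a rough sketch of that refactoring, the lookup could be moved into a prolog variable roughly as follows. This assumes the code lives in app.xql, where the app prefix is already bound; the variable name $app:ignored-elements is only an illustration and does not come from the Shakespeare app itself.

```xquery
(: Hypothetical prolog variable: collection.xconf is read when the variable
   is evaluated, not once per text node passed to app:filter. :)
declare variable $app:ignored-elements as xs:string* :=
    for $qname in doc('/db/system/config/db/apps/shakespeare/collection.xconf')//*:ignore/@qname/string()
    return
        (: Keep only the local part of the qname, whether or not it has a prefix. :)
        if (contains($qname, ':'))
        then substring-after($qname, ':')
        else $qname;

declare %private function app:filter($node as node(), $mode as xs:string) as xs:string? {
    (: Suppress text whose parent is one of the fixed or configured ignored elements. :)
    if (local-name($node/parent::*) = ('speaker', 'stage', 'head', $app:ignored-elements))
    then ()
    else
        if ($mode eq 'before')
        then concat($node, ' ')
        else concat(' ', $node)
};
```

The function body is otherwise unchanged; only the source of the ignored names has moved.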

PS: namespace prefixes can be anything you choose, but the standard namespace prefix for the http://www.tei-c.org/ns/1.0 namespace is "tei", and changing it to "teins" can only lead to confusion.