I have an exceptions file that breaks the functionality of the ignore_chars directive.
The example keyword I am working with is t-shirt, which appears in the database. I need the ignore_chars directive to ignore the - character so that users can search for tshirt or t-shirt and get the same results.
The result of CALL KEYWORDS('tshirt t-shirt', 'catalog') is:
+-----------+------------+
| tokenized | normalized |
+-----------+------------+
| tshirt    | TXRT       |
| tshirt    | TXRT       |
+-----------+------------+
To get t shirt to map to the above results, I created an exceptions file that looks like this:
t shirt > tshirt
When I run the query CALL KEYWORDS('t shirt tshirt t-shirt', 'catalog'), this is what I get:
+-----------+------------+
| tokenized | normalized |
+-----------+------------+
| tshirt    | TXRT       |
| tshirt    | TXRT       |
| shirt     | XRT        |
+-----------+------------+
I expected the exceptions file to rewrite the two words t shirt to the single keyword tshirt, so that all 3 tokens would have the same normalized value.
Instead, the - in the t-shirt keyword is no longer ignored and the keyword maps to just shirt, which has a completely different normalized form than tshirt. On top of this, searching with any of the related keywords above returns 0 results.
When I remove the exceptions file, ignore_chars works fine and search works again for these keywords.
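For reference, a setup like the one described would correspond to index settings roughly along these lines. This is only a sketch: the index name catalog is taken from the CALL KEYWORDS calls above, but the exceptions path is a placeholder and U+002D is an assumption about how the - character was listed in ignore_chars. All other index settings are omitted.

```
index catalog
{
    # other index settings (source, path, morphology, ...) omitted

    # assumed form: drop the hyphen so that t-shirt is indexed as tshirt
    ignore_chars = U+002D

    # placeholder path; the file contains the single line: t shirt > tshirt
    exceptions = /path/to/exceptions.txt
}
```

Commenting out the exceptions line in a configuration like this corresponds to the "without the exceptions file" case, where ignore_chars behaves as expected.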