Search in Single-Token-Field using Lucene.NET

https://stackoverflow.com/questions/19836528

25-07-2022

Question

I'm using Lucene.NET 3.0.3 to index the content of Word, Excel, and other documents, along with some custom fields for each document.
If I index a field named "title" with Field.Index.NOT_ANALYZED, the Lucene index stores the field in the correct form: the whole title is stored as a single token. That's what I want.

e.g. title of document is "Lorem ipsum dolor"
field in Lucene-index: "Lorem ipsum dolor"

If I run an exact search on this field, I get no results.
My search term looks like: title:"Lorem ipsum dolor"
For searching I use the same StandardAnalyzer.

Why can't I find the document?

Solution

StandardAnalyzer splits text on whitespace, among other delimiters, and also lowercases each token. That is, it tokenizes the search term into three tokens:

( lorem, ipsum, dolor )

But you indexed the title field using Field.Index.NOT_ANALYZED, so the analyzer was never applied to it at index time, and none of the three tokens above can match the single token stored in this field:

( Lorem ipsum dolor )
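
The mismatch can be illustrated with a small toy model. This is only a sketch in plain Python, not Lucene code: standard_like_tokens and keyword_like_tokens are hypothetical stand-ins that roughly imitate what StandardAnalyzer and a NOT_ANALYZED field do to the text.

```python
import re

def standard_like_tokens(text):
    """Rough stand-in for StandardAnalyzer: split on non-word characters, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def keyword_like_tokens(text):
    """Rough stand-in for a NOT_ANALYZED field: the whole value is one token."""
    return [text]

# What ends up in the index for the title field (a single term).
indexed_terms = set(keyword_like_tokens("Lorem ipsum dolor"))

# What the search side produces from the same text (three terms).
query_terms = standard_like_tokens("Lorem ipsum dolor")

# A query term only matches if it is exactly equal to an indexed term.
matches = [t for t in query_terms if t in indexed_terms]

print("indexed:", indexed_terms)
print("query:  ", query_terms)
print("matches:", matches)
```

In this model the query side never produces the one term that exists in the index, which is the situation described above.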

Use KeywordAnalyzer when searching this field. It emits the entire input as a single token, which is the form in which a NOT_ANALYZED field is stored. As always, the analysis applied at search time has to match the analysis applied at index time.
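
To show why matching analysis on both sides fixes the search, here is a toy inverted index in plain Python. It is a sketch under stated assumptions, not the Lucene.NET API: FIELD_ANALYZERS, analyze, build_index and search are hypothetical names, and the per-field lookup only imitates the idea of treating "title" as a keyword while other fields stay tokenized (in Lucene, PerFieldAnalyzerWrapper would likely be the class to look at for that).

```python
import re

def standard_like(text):
    """Rough stand-in for StandardAnalyzer: split on non-word characters, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def keyword_like(text):
    """Rough stand-in for KeywordAnalyzer: the whole input is one token."""
    return [text]

# Hypothetical per-field configuration: "title" is kept as a single token,
# every other field falls back to the standard-like tokenizer.
FIELD_ANALYZERS = {"title": keyword_like}

def analyze(field, text):
    return FIELD_ANALYZERS.get(field, standard_like)(text)

def build_index(docs):
    """Map (field, term) -> set of document ids."""
    index = {}
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            for term in analyze(field, value):
                index.setdefault((field, term), set()).add(doc_id)
    return index

def search(index, field, text):
    """Return ids of documents that contain every query term in the given field."""
    postings = [index.get((field, term), set()) for term in analyze(field, text)]
    return set.intersection(*postings) if postings else set()

docs = {1: {"title": "Lorem ipsum dolor", "content": "Lorem ipsum dolor sit amet"}}
index = build_index(docs)

title_hits = search(index, "title", "Lorem ipsum dolor")  # same analysis on both sides
content_hits = search(index, "content", "ipsum")          # tokenized field still works
partial_title_hits = search(index, "title", "ipsum")      # only the full title matches

print(title_hits, content_hits, partial_title_hits)
```

Because the same function analyzes a field at index time and at search time, the full title is found. The trade-off is that the title field now only matches on the complete, case-sensitive value.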

Licensed under: CC-BY-SA with attribution