Incremental/Continuous crawl and case sensitivity - Duplicate items
02-10-2020
Question
I just stumbled upon an interesting behavior:
- Have a document set:
- Title: My Special Document Set
- Perform a crawl (continuous / incremental crawl)
- Search returns 1 result.
- Rename the document set with different casing: "my special docuMENT SET"
- Perform a continuous / incremental crawl
- Search returns 2 results. One for each case (my special... & My Special...)
- The DocId managed property is different for both search results.
- The Duplicate Rows count is 0; in other words, Total Rows: 2 and Total Rows including Duplicates: 2, so trimduplicates=true does not help and there are still 2 results.
- When I now perform a full crawl, the duplicate entry is removed.
It seems that continuous and incremental crawls do not handle case-only renames of folders/FileLeafRef, and instead create a duplicate entry when only the casing of an item changes. Besides running a full crawl, is there anything else I could do about this?
We're currently evaluating an event receiver that triggers CrawlLog.RecrawlDocument on renames, but that seems non-standard and weird. The SPListItem.UniqueItemId is the same for the item, so we could filter the result set on it. Still, it is odd that duplicate detection doesn't kick in.
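To illustrate the filtering idea, here is a minimal client-side sketch in Python. It assumes the search results have already been fetched into a list of dictionaries and that each row exposes the item's unique ID under a key called UniqueId. That key name, the sample rows, and the dedupe_by_unique_id helper are all hypothetical and not part of any SharePoint API. It keeps the first row seen for each ID, which is not necessarily the row with the current casing.

```python
def dedupe_by_unique_id(rows, key="UniqueId"):
    """Keep the first row seen for each unique ID.

    IDs are compared case-insensitively and without surrounding braces,
    since GUIDs may be rendered in either form.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        item_id = str(row.get(key, "")).strip("{}").lower()
        if not item_id:
            # No ID available: keep the row rather than silently drop it.
            unique_rows.append(row)
            continue
        if item_id in seen:
            continue  # Same underlying item, e.g. a stale entry with old casing.
        seen.add(item_id)
        unique_rows.append(row)
    return unique_rows


# Hypothetical sample data mirroring the scenario above:
# two index entries with different DocId values but the same item ID.
rows = [
    {"Title": "My Special Document Set", "DocId": 101,
     "UniqueId": "{3F2504E0-4F89-11D3-9A0C-0305E82C3301}"},
    {"Title": "my special docuMENT SET", "DocId": 102,
     "UniqueId": "{3f2504e0-4f89-11d3-9a0c-0305e82c3301}"},
]
result = dedupe_by_unique_id(rows)
```

This only hides the duplicate from the displayed results; the stale entry stays in the index until it is recrawled or a full crawl runs.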
Solution
This is a SharePoint bug. It seems to have been fixed in the October 2014 CU, although I couldn't find any details in a KB article.
I installed the May 2015 CU on a SharePoint 2013 SP1 server, and the problem is gone.