I ended up just splitting into multiple indexes and then performing the MLT query. Otherwise the request is too heavy. I hope the Lucene developers will ov
Limiting MoreLikeThis of Lucene to a subset of my documents
01-12-2021
Question
I am using Lucene to index the content of my site and provide a search facility. I also use Lucene's MoreLikeThis to generate a "related pages" feature for the site. My site is multilingual, so I need to limit MoreLikeThis to one specific language at a time.
Does anyone have an idea how to do this?
Solution 2
OTHER TIPS
MoreLikeThis.like() returns a Query object:
MoreLikeThis mlt = new MoreLikeThis(ir); // ir is an open IndexReader
Reader target = ... // original source of the document you want to find similar documents for
Query query = mlt.like(target);
You could create a second query that checks for the language, then combine both queries in a BooleanQuery, like so:
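// "language" and "en" are hypothetical; use the field and value your index actually stores
Query languageQuery = new TermQuery(new Term("language", "en"));
BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(query, BooleanClause.Occur.MUST); // the MoreLikeThis query built above
booleanQuery.add(languageQuery, BooleanClause.Occur.MUST);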
This is not very efficient, but it will get the job done if you have a small corpus.
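On more recent Lucene versions the no-argument BooleanQuery constructor is no longer available, so the same idea can be sketched with BooleanQuery.Builder. This is only a sketch under some assumptions: the class name RelatedPages, the method name, and the field names "content" and "language" are hypothetical, and the language is assumed to be indexed as a single untokenized term (for example with a StringField). It also swaps the second MUST clause for a FILTER clause, which restricts the results to one language without letting the language term affect scoring.

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public final class RelatedPages {

    // Hypothetical field names; replace them with the ones used in your index.
    static final String CONTENT_FIELD = "content";
    static final String LANGUAGE_FIELD = "language";

    /** Builds a "more like this" query that only matches documents in one language. */
    static Query relatedInLanguage(IndexReader reader, Analyzer analyzer,
                                   String text, String language) throws IOException {
        MoreLikeThis mlt = new MoreLikeThis(reader);
        mlt.setAnalyzer(analyzer);                          // needed to tokenize the raw text
        mlt.setFieldNames(new String[] {CONTENT_FIELD});    // fields used for term statistics

        // Query made of the "interesting" terms found in the source text.
        Query similar = mlt.like(CONTENT_FIELD, new StringReader(text));

        // Exact match on the language code; assumes the field is not tokenized.
        Query languageFilter = new TermQuery(new Term(LANGUAGE_FIELD, language));

        return new BooleanQuery.Builder()
                .add(similar, BooleanClause.Occur.MUST)           // contributes to the score
                .add(languageFilter, BooleanClause.Occur.FILTER)  // must match, not scored
                .build();
    }
}
```

The resulting query can be passed to IndexSearcher.search(query, n) like any other query. Because the language clause is a filter, the ranking of the related pages comes from the MoreLikeThis terms alone.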