Question

I am using Lucene to index the content of my site and provide a search facility. I also use Lucene's MoreLikeThis to generate a "related pages" feature for the site. My site is multilingual, so I need to limit MoreLikeThis to one specific language at a time.

Does anyone have an idea how to do this?

Solution

I ended up just splitting the content into multiple indexes and then performing the MLT query against them. Otherwise the request is too heavy. I hope the Lucene developers will ov
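
The answer does not say how the indexes were split. A minimal sketch, assuming one index directory per language code under a common base directory, could route each request like this. The class name LanguageIndexRouter, the method indexPathFor, and the "index-<code>" directory layout are all hypothetical, not part of Lucene or of the original answer.

```java
import java.nio.file.Path;
import java.util.Locale;

/**
 * Hypothetical helper: maps a language code to the directory that holds
 * the index for that language. The "index-<code>" layout is an assumption.
 */
final class LanguageIndexRouter {
    private final Path baseDir;

    LanguageIndexRouter(Path baseDir) {
        this.baseDir = baseDir;
    }

    /** Returns baseDir/index-<code>, with the code trimmed and lower-cased. */
    Path indexPathFor(String languageCode) {
        if (languageCode == null || languageCode.trim().isEmpty()) {
            throw new IllegalArgumentException("language code must not be empty");
        }
        String normalized = languageCode.trim().toLowerCase(Locale.ROOT);
        return baseDir.resolve("index-" + normalized);
    }
}
```

Each returned path would then be opened as its own Lucene index, and the IndexReader for that index passed to MoreLikeThis, so the similarity query only ever sees documents in one language.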

Other tips

MoreLikeThis.like() returns a Query object:
MoreLikeThis mlt = new MoreLikeThis(ir); // ir is an open IndexReader
Reader target = ... // original source of the document you want to find similar documents for
Query query = mlt.like(target);

You could create a second query that checks for the language, then wrap both queries in a BooleanQuery, like so:
// languageQuery is a Query you build yourself to match the wanted language
BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(query, BooleanClause.Occur.MUST); // the MoreLikeThis query from above
booleanQuery.add(languageQuery, BooleanClause.Occur.MUST);

This is not very efficient, but it will get the job done if you have a small corpus.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow