calculate precision and recall in lucene using logger

https://stackoverflow.com/questions/10470554

06-06-2021

Question

I am using the Lucene benchmark module to measure precision and recall. The original code reads two files. The topics file:

QualityQuery qqs[] = qReader.readQueries( new BufferedReader(new FileReader(topicsFile)));

and qrelsFile:

Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));

As I understand it, both are text files, but I don't know what to fill them with. Do I write them manually, or is there some code that populates them with the needed information?

Any help with measuring precision and recall in Lucene would be appreciated.

thanks

Solution

The Javadoc for TrecJudge (http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/all/org/apache/lucene/benchmark/quality/trec/TrecJudge.html) says:

Judge if given document is relevant to given quality query, based on Trec format for judgements.

TREC (the Text REtrieval Conference, http://trec.nist.gov/) is a series of conferences that run evaluation competitions for information retrieval systems.

I suspect you may have to do some of your own detective work but this is of interest to me and I may add some more info.

In general the strategy for benchmarking will be something like:

1. Provide a corpus relating to your area of interest.
2. Annotate part of it to indicate what should be recalled. This might be two sets: one containing the information (positive) and one without it (negative).
3. Divide the annotated part in two: one part to train your application and one to test it (more sophisticated approaches, such as cross-validation, use more than one split).
4. Run the evaluation software over your test set.
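To make the last step concrete, here is a minimal sketch of what precision and recall mean for a single query. It uses plain Java only and is not the Lucene benchmark code; the class name PrecisionRecall and the document ids are made up for illustration. Precision is the share of returned documents that were annotated as relevant, and recall is the share of annotated relevant documents that were returned.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Lucene.
class PrecisionRecall {

    /** Number of distinct returned documents that are annotated as relevant. */
    static long hits(List<String> retrieved, Set<String> relevant) {
        return retrieved.stream().distinct().filter(relevant::contains).count();
    }

    /** Fraction of the returned documents that are relevant. */
    static double precision(List<String> retrieved, Set<String> relevant) {
        long returned = retrieved.stream().distinct().count();
        return returned == 0 ? 0.0 : (double) hits(retrieved, relevant) / returned;
    }

    /** Fraction of the relevant documents that were returned. */
    static double recall(List<String> retrieved, Set<String> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up annotation: the documents a human judged relevant for one query.
        Set<String> relevant = new HashSet<>(Arrays.asList("doc1", "doc4", "doc7"));
        // Made-up search result for the same query.
        List<String> retrieved = Arrays.asList("doc1", "doc2", "doc4", "doc9");

        // 2 of the 4 returned documents are relevant.
        System.out.println("precision = " + precision(retrieved, relevant));
        // 2 of the 3 relevant documents were returned.
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

The annotated set plays the role of the qrels file, and the returned list is whatever your searcher produces for the query.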

You will probably need to provide your topics and relevance judgments in TREC format, I suspect.
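As far as I can tell from the TrecJudge Javadoc, each line of the qrels file has four whitespace-separated fields: the query number, a field that is ignored (usually 0), the document name, and a relevance flag (0 means not relevant). You would normally write these lines by hand, or export them from whatever tool you use to annotate your corpus. The sketch below is plain Java that parses text in that layout; it is an illustration of the format under that assumption, not Lucene's own parser, and the names QrelsSketch and parseQrels are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration of the qrels layout; not Lucene's TrecJudge.
class QrelsSketch {

    /*
     * The topics file, as I understand the TREC layout, looks roughly like this
     * (check the TrecTopicsReader Javadoc for your Lucene version):
     *
     *   <top>
     *   <num> Number: 19
     *   <title> short query text
     *   <desc> Description:
     *   longer description of the information need
     *   <narr> Narrative:
     *   notes on what counts as relevant
     *   </top>
     */

    /** Maps each query id to the names of the documents judged relevant for it. */
    static Map<String, Set<String>> parseQrels(String text) {
        Map<String, Set<String>> relevant = new TreeMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split("\\s+");
            if (fields.length < 4) {
                throw new IllegalArgumentException("Expected 4 fields: " + line);
            }
            String queryId = fields[0];
            // fields[1] is the ignored field
            String docName = fields[2];
            boolean isRelevant = !"0".equals(fields[3]);
            if (isRelevant) {
                relevant.computeIfAbsent(queryId, k -> new TreeSet<>()).add(docName);
            }
        }
        return relevant;
    }

    public static void main(String[] args) {
        // Made-up judgments: query 19 has one relevant and one non-relevant document.
        String qrels = "19 0 doc303 1\n"
                     + "19 0 doc7295 0\n"
                     + "20 0 doc12 1\n";
        System.out.println(parseQrels(qrels)); // prints {19=[doc303], 20=[doc12]}
    }
}
```

The query numbers in the qrels file must match the numbers in the topics file, and the document names must match whatever field you pass to the benchmark as the document name.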

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow