Question

I am using the Lucene benchmark package to measure precision and recall. The original code reads two files. The topics file:

QualityQuery qqs[] = qReader.readQueries( new BufferedReader(new FileReader(topicsFile)));

and the qrels file:

Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));

As I understand it, both are text files, but I don't know what I need to fill them with. Do I write them by hand, or is there some code that populates them with the required information?

I would appreciate any help with measuring precision and recall in Lucene.

Thanks


Solution

The Javadoc for TrecJudge (http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/all/org/apache/lucene/benchmark/quality/trec/TrecJudge.html) says:

Judge if given document is relevant to given quality query, based on Trec format for judgements.
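As far as I can tell from that Javadoc, the judgements (qrels) file has one judgement per line with four whitespace-separated columns: query ID, an unused iteration column, document name, and a relevance flag where 0 means not relevant. The sketch below is a simplified stand-in that parses such lines and answers the same question a judge does. SimpleQrelsJudge and its methods are hypothetical names, not Lucene's implementation, so check the layout against the Lucene version you use.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a qrels-based judge; this is not Lucene's TrecJudge. */
class SimpleQrelsJudge {
    // query ID -> names of the documents judged relevant for that query
    private final Map<String, Set<String>> relevantDocs = new HashMap<>();

    /** Parses lines assumed to look like: queryID iteration docName relevance */
    SimpleQrelsJudge(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue; // skip blank lines and comment lines
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 4) {
                throw new IllegalArgumentException("Expected 4 columns, got: " + line);
            }
            String queryId = parts[0];
            String docName = parts[2]; // parts[1] is the unused iteration column
            boolean relevant = !"0".equals(parts[3]);
            if (relevant) { // only relevant documents need to be remembered
                relevantDocs.computeIfAbsent(queryId, k -> new HashSet<>()).add(docName);
            }
        }
    }

    /** Returns true if docName was judged relevant for queryId. */
    boolean isRelevant(String docName, String queryId) {
        Set<String> docs = relevantDocs.get(queryId);
        return docs != null && docs.contains(docName);
    }

    public static void main(String[] args) {
        List<String> qrels = Arrays.asList(
                "1 0 doc-a 1",
                "1 0 doc-b 0",
                "2 0 doc-c 1");
        SimpleQrelsJudge judge = new SimpleQrelsJudge(qrels);
        System.out.println(judge.isRelevant("doc-a", "1"));
        System.out.println(judge.isRelevant("doc-b", "1"));
    }
}
```

The document names in the judgements have to match whatever name field you stored in your index, otherwise nothing will ever be counted as relevant.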

TREC, the Text REtrieval Conference (http://trec.nist.gov/), is a series of conferences that run evaluation competitions for information retrieval.

I suspect you will have to do some detective work of your own, but this is of interest to me and I may add more information later.

In general the strategy for benchmarking will be something like:

  • provide a corpus relating to your area of interest
  • annotate part of it to indicate what should be recalled. This might be two sets: one containing the information (positive) and one without it (negative)
  • divide this into two parts: one to train your application and one to test it (there are more sophisticated approaches that require more)
  • run the evaluation software over your test set.
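To make the last step concrete, here is a minimal sketch of what the evaluation computes for a single query, given the documents your search returned and the documents you annotated as relevant. The class and method names are hypothetical, and this is plain Java rather than the Lucene benchmark code, which reports more detailed statistics.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: precision and recall for one query. */
class PrecisionRecallSketch {

    /** Number of distinct retrieved documents that are also in the relevant set. */
    static int hits(List<String> retrieved, Set<String> relevant) {
        Set<String> found = new HashSet<>(retrieved);
        found.retainAll(relevant);
        return found.size();
    }

    /** Fraction of the retrieved documents that are relevant. */
    static double precision(List<String> retrieved, Set<String> relevant) {
        int distinctRetrieved = new HashSet<>(retrieved).size();
        if (distinctRetrieved == 0) {
            return 0.0; // nothing retrieved; 0 is a convention chosen for this sketch
        }
        return (double) hits(retrieved, relevant) / distinctRetrieved;
    }

    /** Fraction of the relevant documents that were retrieved. */
    static double recall(List<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // no judgements for this query; 0 is a convention chosen for this sketch
        }
        return (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        List<String> retrieved = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3", "d5"));
        // 2 of the 4 retrieved documents are relevant; 2 of the 3 relevant documents were found
        System.out.println("precision = " + precision(retrieved, relevant));
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

In a real run you would compute these per query from your test set and then average over all queries.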

I suspect you will need to provide both your topics (queries) and your relevance judgements in TREC format.
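If you end up generating the two files yourself, something like the following might be a starting point. It is a sketch under assumptions: the topic layout (<top>, <num> Number:, <title>, <desc> Description:, <narr> Narrative:) and the four-column judgement line are what I understand the TREC readers in the Lucene benchmark to expect, and the class name, file names, query text and document names are all made up for illustration. The relevance judgements themselves still have to come from a person deciding which documents answer each query.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper that formats topics and judgements in an assumed TREC layout. */
class TrecFormatSketch {

    /** One judgement line: queryID, unused iteration column (0), docName, 1 or 0. */
    static String qrelsLine(String queryId, String docName, boolean relevant) {
        return queryId + " 0 " + docName + " " + (relevant ? "1" : "0");
    }

    /** One topic block; the number must match the query ID used in the judgements. */
    static String topic(String number, String title, String description, String narrative) {
        String nl = "\n";
        return "<top>" + nl
                + "<num> Number: " + number + nl
                + nl
                + "<title> " + title + nl
                + nl
                + "<desc> Description:" + nl
                + description + nl
                + nl
                + "<narr> Narrative:" + nl
                + narrative + nl
                + nl
                + "</top>" + nl;
    }

    public static void main(String[] args) throws IOException {
        // Made-up example data: one query and two judged documents.
        String topics = topic("1", "apache lucene scoring",
                "How does Lucene score documents?",
                "Relevant documents explain the scoring formula.");
        String qrels = qrelsLine("1", "doc-a", true) + "\n"
                + qrelsLine("1", "doc-b", false) + "\n";

        // Hypothetical file names; pass these paths to your benchmark code.
        Files.write(Paths.get("topics.txt"), topics.getBytes(StandardCharsets.UTF_8));
        Files.write(Paths.get("qrels.txt"), qrels.getBytes(StandardCharsets.UTF_8));
    }
}
```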

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow