Domanda

I am trying to get the doc frequency for each term in term enum. But I getting everytime only a "1" for the document frequency for all terms. Any hint, what the problem could be? This is my code:

Terms terms = reader.getTermVector(docId, field);
TermsEnum termsEnum = null;
termsEnum = terms.iterator(termsEnum);
BytesRef termText = null;
while((termsEnum.next()) != null){
    int docNumbersWithTerm = termsEnum.docfreq();
    System.out.println(docNumbersWithTerm);
}
È stato utile?

Soluzione

The Terms instance from IndexReader.getTermVector acts as if you have a single-document index, comprised entirely of the document specified. Since there is only one document to consider in this context, you should always get docfreq() = 1. You could generate the docfreq from the full index using the IndexReader.docFreq method:

int docNumbersWithTerm = reader.docFreq(new Term(termsEnum.term(), field));
System.out.println(docNumbersWithTerm);
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top