Question

I'm indexing PDFs with Solr using the ExtractingRequestHandler. I would like to display the page number along with hits in a document, e.g. "term foo was found in bar.pdf on pages 2, 3 and 5."

Is it possible to include page numbers in the query result like this?


Solution

It would require some development effort, but you could achieve this by indexing each page of each document as a separate Solr document and then using field collapsing to group the page hits that belong to the same source document.

Note that field collapsing (also called result grouping) is available in released versions from Solr 3.3 onward; before that release you needed a nightly build. More updates are expected in the next major version (Solr 4.0).
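The page-per-document approach could look roughly like the Python sketch below. It assumes you have already split each PDF into per-page text with some other tool, since this answer does not cover how to do that. The field names `doc_id`, `page` and `text` and the three helper functions are made up for illustration; only the `group`, `group.field` and `group.limit` request parameters and the `grouped` response layout come from Solr's result grouping feature.

```python
from urllib.parse import urlencode


def build_page_docs(doc_id, pages):
    """Turn a list of page texts into one Solr document per page.

    The field names (id, doc_id, page, text) are assumptions and must
    match whatever your schema defines.
    """
    return [
        {"id": f"{doc_id}_p{n}", "doc_id": doc_id, "page": n, "text": text}
        for n, text in enumerate(pages, start=1)
    ]


def grouped_query(term, pages_per_doc=100):
    """Build a query string that collapses page hits by source document."""
    params = {
        "q": f"text:{term}",
        "group": "true",               # turn on field collapsing
        "group.field": "doc_id",       # one group per source PDF
        "group.limit": pages_per_doc,  # max page hits returned per group
        "fl": "doc_id,page",
        "wt": "json",
    }
    return urlencode(params)


def pages_by_document(response):
    """Map each source document to the sorted page numbers that matched."""
    groups = response["grouped"]["doc_id"]["groups"]
    return {
        g["groupValue"]: sorted(d["page"] for d in g["doclist"]["docs"])
        for g in groups
    }
```

Feeding the parsed JSON response of such a query to `pages_by_document` gives you, per file, the list of pages needed to print a message like "term foo was found in bar.pdf on pages 2, 3 and 5."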

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow