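This should get you started: Solr's ExtractingRequestHandler (also known as Solr Cell), which integrates Apache Tika into Solr so that the text of rich documents such as PDFs can be extracted and indexed.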
Indexing documents with websolr
11-03-2022
Question
We're looking at using the Websolr add-on for searching Resources within our Rails app.
The app contains many Resource models. Most of the Resource models are self-contained, with a series of attributes (author, title, a set of tags, etc.). However, some of the Resource models have a PDF attached. We need to index the content of this PDF so that it is searchable as part of the Resource.
How should I approach this?
Solution
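As a rough illustration of how the ExtractingRequestHandler could be called from a Rails app, here is a minimal sketch using only Ruby's standard library. It assumes the handler is mounted at its conventional `/update/extract` path and that your Websolr index has it enabled, which is worth confirming with Websolr. The method names `build_extract_request` and `index_pdf`, the base URL, and the field names are hypothetical and chosen for illustration; `literal.<field>` and `commit` are standard request parameters of the handler.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a POST request for Solr's ExtractingRequestHandler.
# solr_url:  base URL of the Solr index (assumption: supplied by your Websolr config)
# id:        value for the unique key field, sent as literal.id
# pdf_bytes: raw contents of the PDF file
# literals:  extra fields (author, title, ...) sent as literal.<field> parameters
def build_extract_request(solr_url, id:, pdf_bytes:, literals: {})
  params = { "literal.id" => id, "commit" => "true" }
  literals.each { |field, value| params["literal.#{field}"] = value }

  uri = URI.parse("#{solr_url.chomp('/')}/update/extract")
  uri.query = URI.encode_www_form(params)

  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/pdf"
  request.body = pdf_bytes
  request
end

# Reads the PDF from disk and sends it to Solr; returns the Net::HTTPResponse.
def index_pdf(solr_url, id:, pdf_path:, literals: {})
  request = build_extract_request(
    solr_url, id: id, pdf_bytes: File.binread(pdf_path), literals: literals
  )
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end

# Hypothetical usage (not executed here; the URL and fields are placeholders):
#   index_pdf("http://localhost:8983/solr", id: "resource-42",
#             pdf_path: "report.pdf", literals: { title: "Annual Report" })
```

Sending the Resource's own attributes as `literal.*` parameters keeps them in the same Solr document as the extracted PDF text, so a single query can match either. Which field the extracted text ends up in depends on your schema and the handler's `fmap` settings, so treat the field mapping as something to verify against your own configuration.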
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow