Indexing PDF documents

Question

What is the best way to index PDF documents? Should I convert them to plain text first and index that, or is there a better way to index PDF files directly?

Solution

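Assuming you're talking about Solr: see the ExtractingRequestHandler (also known as Solr Cell). It uses Apache Tika to extract text and metadata from PDFs and index them directly, so you don't need a separate PDF-to-text conversion step.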
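
To give a rough idea of what using that handler can look like, here is a minimal Python sketch that posts a raw PDF to the `/update/extract` endpoint. It assumes the handler is enabled in your Solr configuration and mapped to `/update/extract`. The base URL, the core name `docs`, the document id, and the helper names `build_extract_request` and `index_pdf` are all placeholders of mine, not anything from the question.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_request(solr_url, core, doc_id, pdf_bytes, commit=True):
    """Build a POST request for Solr's /update/extract handler.

    solr_url and core are assumptions about your setup,
    e.g. "http://localhost:8983/solr" and "docs".
    """
    # literal.id sets the unique key of the resulting Solr document.
    params = {"literal.id": doc_id}
    if commit:
        # Commit right away so the document becomes searchable.
        params["commit"] = "true"
    url = f"{solr_url.rstrip('/')}/{core}/update/extract?{urlencode(params)}"
    return Request(
        url,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )


def index_pdf(solr_url, core, doc_id, pdf_path):
    """Send one PDF file to Solr and return the HTTP status code."""
    with open(pdf_path, "rb") as f:
        request = build_extract_request(solr_url, core, doc_id, f.read())
    with urlopen(request) as response:
        return response.status


# Hypothetical usage, requires a running Solr instance:
# index_pdf("http://localhost:8983/solr", "docs", "report-1", "report.pdf")
```

Committing on every request is convenient for trying things out, but for bulk indexing you would probably pass `commit=False` and commit once at the end.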
