Question

I'm trying to index PDF files using Apache Lucene 4.4, but I keep getting the following exception:

Exception in thread "main" java.lang.NoSuchFieldError: TOKENIZED
    at com.snowtide.pdf.lucene.LuceneInterface20.addField(SourceFile:18)
    at com.snowtide.pdf.lucene.PDFDocumentFactory.buildPDFDocument(SourceFile:174)
    at com.snowtide.pdf.lucene.PDFDocumentFactory.buildPDFDocument(SourceFile:84)
    at com.apache.lucene.search.EasyLuceneIntegration.addPDFToIndex(EasyLuceneIntegration.java:134)
    at com.apache.lucene.search.EasyLuceneIntegration.main(EasyLuceneIntegration.java:62)

I'm using PDFTextStream and following their example.

Solution

The project you've referenced only supports up to Lucene 2.2, which is why it fails against Lucene 4.4 with a NoSuchFieldError. I'd recommend looking into Tika to get your PDFs into an acceptable format, or you can just use PDFBox directly (which, I believe, is the package Tika uses for PDFs).
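
If you want to confirm the version mismatch before switching libraries, a small reflection check can show whether the Lucene jar on your classpath still exposes the field the integration was compiled against. This is only a sketch: the class name FieldCheck and the helper hasStaticField are made up for illustration, and it assumes the missing field is Field.Index.TOKENIZED (binary class name org.apache.lucene.document.Field$Index), which is how older Lucene releases named it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class FieldCheck {

    /**
     * Returns true if the named class can be loaded and exposes a public
     * static field with the given name; returns false otherwise.
     * (Hypothetical helper, not part of Lucene or PDFTextStream.)
     */
    public static boolean hasStaticField(String className, String fieldName) {
        try {
            // Load the class without initializing it; we only inspect its members.
            Class<?> cls = Class.forName(className, false, FieldCheck.class.getClassLoader());
            Field f = cls.getField(fieldName);
            return Modifier.isStatic(f.getModifiers());
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            // Class is not on the classpath, or it has no such public field.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this is the nested class the old integration code refers to.
        String cls = "org.apache.lucene.document.Field$Index";
        System.out.println("TOKENIZED present: " + hasStaticField(cls, "TOKENIZED"));
        System.out.println("ANALYZED present:  " + hasStaticField(cls, "ANALYZED"));
    }
}
```

Run it with the same classpath as your indexer. If TOKENIZED is reported as absent, the integration code was built against an older Lucene API than the one you have, and swapping the PDF extraction library (or downgrading Lucene) is the way out.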
