Question

I've read this article: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika What I get from Tika is plain text without any styling, for Solr to search in. Is it possible to get the text back from Solr with its style? In other words, we need to show the text in its original style after it has been found by a Solr search.

Solution

If you think about it, what is "original style" in a PDF? What components of the "style" do you want to keep?

It's not just font and weight, it's stroke, fill, angle, path, graphics, tracking, transparency, transformations and more. If you got all that, how would you display it in your UI or web page?

You can't really replicate the original style any way other than displaying the original PDF. So that's the way people usually do it if they want the original formatting.

Otherwise, they just use the pure text.
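
As a rough sketch of the "display the original PDF" approach: index the plain text for searching, and keep a pointer to the untouched file next to it so each hit can link back to the original. The field names `id`, `text` and `source_url`, the `IndexedDoc` class and both helper functions are hypothetical, made up for illustration; they are not part of Solr or Tika, and no Solr client is called here.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class IndexedDoc:
    doc_id: str
    text: str        # plain text, e.g. as extracted by Tika
    source_url: str  # assumed location of the original, styled PDF


def to_index_fields(doc: IndexedDoc) -> dict:
    """Build the field dict you would send to the search index (hypothetical schema)."""
    return {"id": doc.doc_id, "text": doc.text, "source_url": doc.source_url}


def render_hit(fields: dict, query: str, width: int = 40) -> str:
    """Render a hit as a plain-text snippet that links to the original PDF."""
    text = fields["text"]
    pos = text.lower().find(query.lower())
    if pos >= 0:
        # Show some context on both sides of the first match.
        start = max(pos - width, 0)
        end = pos + len(query) + width
    else:
        # No literal match in the text: fall back to the beginning.
        start, end = 0, 2 * width
    snippet = text[start:end]
    href = escape(fields["source_url"], quote=True)
    return f'<a href="{href}">{escape(snippet)}</a>'


doc = IndexedDoc("1", "Quarterly report: revenue grew 10%.", "/files/report.pdf")
print(render_hit(to_index_fields(doc), "revenue"))
```

The search runs on the plain text only; the styling question is handed off to whatever PDF viewer opens the linked file.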

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow