Question

I am trying to use iText from Java to count the tables in a PDF file, but without success. Can anyone point me in the right direction?


Solution

If your PDF is tagged, you can inspect its structure tree (the StructTreeRoot entry in the document catalog) for Table structure elements. If your PDF isn't tagged, there are no tables in your PDF. You may see tables with the naked eye, but as far as the PDF file is concerned, there are only lines and snippets of text, no tables!
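
For the tagged case, the inspection boils down to a recursive walk: start at the StructTreeRoot, follow each element's K (kids) entry, and count the elements whose S (structure type) is Table. The sketch below is only an illustration of that walk. To keep it runnable without any library, it models PDF dictionaries as plain Java maps and arrays as lists; the class name TableCounter and the sample tree are made up. With iText 5 you would, as far as I know, get the real dictionary via reader.getCatalog().getAsDict(PdfName.STRUCTTREEROOT) and walk PdfDictionary and PdfArray objects in the same way. The sketch also ignores the RoleMap, so custom structure types that are mapped to Table are not counted.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: dictionaries are modelled as Map, arrays as List.
final class TableCounter {

    // Counts structure elements whose "S" entry is "Table",
    // descending through the "K" (kids) entries only.
    static int countTables(Object node) {
        if (node instanceof List<?>) {
            int total = 0;
            for (Object kid : (List<?>) node) {
                total += countTables(kid);
            }
            return total;
        }
        if (node instanceof Map<?, ?>) {
            Map<?, ?> element = (Map<?, ?>) node;
            int self = "Table".equals(element.get("S")) ? 1 : 0;
            return self + countTables(element.get("K"));
        }
        // Marked-content IDs (integers), missing kids, and so on.
        return 0;
    }

    // Made-up structure tree: a paragraph, a table, and a section holding a second table.
    static Map<String, Object> sampleTree() {
        Map<String, Object> paragraph = Map.of("S", "P", "K", 0);
        Map<String, Object> table1 = Map.of("S", "Table", "K", List.of(1, 2));
        Map<String, Object> table2 = Map.of("S", "Table", "K", 3);
        Map<String, Object> section = Map.of("S", "Sect", "K", table2);
        Map<String, Object> document =
                Map.of("S", "Document", "K", List.of(paragraph, table1, section));
        return Map.of("Type", "StructTreeRoot", "K", document);
    }

    public static void main(String[] args) {
        System.out.println(countTables(sampleTree())); // prints 2
    }
}
```

Note that K may hold a single dictionary, an array, or a plain integer, which is why the walk accepts any Object and returns 0 for anything that is neither a map nor a list.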

A PDF that isn't tagged doesn't know anything about its structure! Extracting tables from a PDF that doesn't contain a StructTreeRoot is about as feasible as recovering the original whole carrots from carrot soup. If that's what you want to do, then hopefully my metaphor explains why you're asking for something impossible (which also explains why you can't find any answers).

How do you find out if a PDF is tagged? Open the PDF in Adobe Reader and click File > Properties (File > Document Properties in some older versions). Near the bottom of the Description tab, there's an entry that reads either Tagged PDF: No or Tagged PDF: Yes.

Licensed under: CC-BY-SA with attribution