Question

I often work with documents that need to be displayed to a user while, at the same time, my system needs the information they contain.

For example, take a generated PDF document (or another "visible" format) that represents an invoice. I need the document itself to display it to the user, and I need the information it contains in order to process it in my system.

OCR and similar techniques are not always reliable and add some overhead, and as you can imagine, I don't want to resort to human input.

Hence my question: have you ever heard of a document standard (or attempts to build one) that contains, in its metadata, the information it displays to the user?

In the format I imagine, the essential information (invoice number, amount, etc.), whether described in advance or not, would be encoded in the document and accessible programmatically.


Solution

ZUGFeRD is an electronic invoice standard that started as PDF documents with embedded XML data. This matches your question pretty well. (Sorry, the website is in German, with incomplete English information.)
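To give a rough idea of what "accessible programmatically" could look like once the embedded XML has been pulled out of the PDF, here is a minimal sketch using only the Python standard library. The sample payload, its element names (`Invoice`, `ID`, `GrandTotalAmount`, `currencyID`) and the helper `read_invoice_fields` are simplified assumptions made up for illustration; they are not the actual ZUGFeRD schema, and extracting the attachment from the PDF (which needs a PDF library) is not shown.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical, simplified payload standing in for the XML embedded in the PDF.
# The element names are illustrative assumptions, not the real ZUGFeRD schema.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Invoice>
  <ID>INV-2024-001</ID>
  <GrandTotalAmount currencyID="EUR">119.00</GrandTotalAmount>
</Invoice>
"""


def read_invoice_fields(xml_bytes):
    """Return the invoice number, total and currency from the XML payload."""
    root = ET.fromstring(xml_bytes)
    total = root.find("GrandTotalAmount")
    return {
        "number": root.findtext("ID"),
        # Decimal avoids floating-point rounding on monetary amounts.
        "total": Decimal(total.text),
        "currency": total.get("currencyID"),
    }


fields = read_invoice_fields(SAMPLE_XML)
print(fields["number"], fields["total"], fields["currency"])
```

The point is that the human-readable PDF and the machine-readable data travel in one file, so no OCR step is needed on the receiving side.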

The whole development is quite convoluted. My impression is that, by trying to incorporate different approaches and formats, the goal of transmitting invoices with automatically processable data has become more complicated instead of simpler.

Licensed under: CC-BY-SA with attribution