Question

I am creating a desktop (WinForms) application that reads TIF/PDF payable invoices and extracts all the invoice information to store in a database.

I can read the standard barcodes (QR Code, Code 39, etc.) and some of the standard payable-invoice fields (Invoice Date, Company Name, Address) with OCR (by OCRing a specific region of the image), but I am unable to capture line items and amounts correctly.

I extract information in two phases:
1. Read specific regions based on the template (user-mapped regions for specific fields)
2. OCR the whole page and search for the standard payable-invoice field names and values
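
To make the second phase concrete, here is a minimal Python sketch (the same idea ports directly to C#) that searches full-page OCR text for a few field labels and then lets non-empty zonal values take priority. The label patterns, the field names, and the `find_fields` / `merge_phases` helpers are illustrative assumptions, not part of any OCR engine; real invoices will need many more label variants.

```python
import re

# Hypothetical label patterns for illustration only.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-/]*)",
        re.IGNORECASE,
    ),
    "invoice_date": re.compile(
        r"invoice\s*date\s*[:\-]?\s*(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})",
        re.IGNORECASE,
    ),
    # \b keeps "Subtotal" from being mistaken for "Total".
    "total": re.compile(
        r"\btotal\s*(?:due|amount)?\s*[:\-]?\s*\$?\s*(\d[\d,]*\.\d{2})",
        re.IGNORECASE,
    ),
}


def find_fields(ocr_text):
    """Phase 2: return the first match for each known field label."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            found[name] = match.group(1)
    return found


def merge_phases(zonal, full_page):
    """Prefer non-empty zonal (phase 1) values; fall back to phase 2."""
    merged = dict(full_page)
    merged.update({key: value for key, value in zonal.items() if value})
    return merged
```

Preferring the zonal value is only one possible policy; it assumes the user-mapped region is more trustworthy than a label search whenever it yields anything at all.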

I know of the following 3 approaches:
1. Create a template for one type of invoice and process all invoices with it.
2. A neural-network-based engine, which needs to be trained with sample data so that it works based on patterns.
3. Form processing, a kind of OMR: the OCR looks at the exact coordinates where the fields were placed on the form (during form design).

Question:
How can I extract payable invoice data using OCR or some intelligent reader?
Primarily I am looking for an algorithm (C# + OCR engine) or a philosophy for capturing payable invoices, but a reference to an SDK with this feature or to a solid commercial product would be helpful too.

I googled and found Abbyy FlexiCapture Engine and IRIS Capture & Extract, which look somewhat promising, but most such products are based on templates or training. They claim that no template or training is required, but nothing looks like 100% automatic capture.

Kindly refer me to some product (at least with a free trial), SDK, or example/sample.


Solution 2

I did some R&D and concluded that there is no specialized SDK for invoice capture that can automate it 95-100%. There are only OCR/ICR and imaging SDKs, which help convert images into text or readable documents; the rest of the capture/data extraction relies solely on custom search algorithms (as ilya-evdokimov mentioned above, you need a mix of steps: zonal OCR, full-text OCR, and then intelligent data extraction). I studied some very popular products; they claim automatic capture, but ultimately they only pull the standard invoice fields automatically, and the rest of the work is the same: either zonal OCR or manual entry. This is what I suggest, though many more improvements are possible depending on the nature of the application:

  1. Store the key field (e.g. VAT# information) for customers in a database/XML file.
  2. Do full-page OCR, find the key field, match it against the customer list, and identify/classify the type of document/image.
  3. Once the document type (invoice payable/receivable, etc.) is identified, look for the standard fields.
  4. Allow the user to create pre-defined templates for each type of document for each company (the sender of the invoices).
  5. Compare the results of both algorithms (full-text OCR and zonal) and keep the one with better accuracy.
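
Steps 1, 2 and 5 can be sketched roughly as follows in Python. Everything named here is an assumption made for illustration: the `KNOWN_SENDERS` table stands in for the database/XML store, the VAT numbers are made up, and `FieldResult.confidence` assumes the OCR engine reports a per-field confidence between 0 and 1 that can serve as a proxy for "better accuracy".

```python
import re
from dataclasses import dataclass

# Step 1: hypothetical key-field store (would live in a database or XML file).
KNOWN_SENDERS = {
    "GB123456789": "Acme Supplies Ltd",
    "DE811234567": "Beispiel GmbH",
}


def normalise(text):
    """Uppercase and keep only letters and digits, so 'GB 123 456 789' still matches."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def classify_sender(ocr_text):
    """Step 2: return (vat_number, sender) for the first known VAT number found, else None."""
    flat = normalise(ocr_text)
    for vat, sender in KNOWN_SENDERS.items():
        if vat in flat:
            return vat, sender
    return None


@dataclass
class FieldResult:
    value: str
    confidence: float  # assumed range 0..1, as reported by the OCR engine


def pick_best(zonal, full_text):
    """Step 5: per field, keep whichever result has the higher confidence."""
    best = {}
    for name in zonal.keys() | full_text.keys():
        candidates = [r for r in (zonal.get(name), full_text.get(name)) if r is not None]
        best[name] = max(candidates, key=lambda r: r.confidence)
    return best
```

Flattening the whole page before matching tolerates OCR spacing errors, but it can also produce false positives across token boundaries, so a production version would probably match within a window around a "VAT" label instead.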

OTHER TIPS

Of course, by 2018 the situation improved a bit. Let me recapitulate the main approaches today:

  • Still a raw OCR engine (Tesseract, Abbyy, Google OCR, etc.) plus regexes (this may still work just fine for some very limited use cases)
  • Abbyy FlexiCapture Engine - still going strong, but still based on templates, if you are willing to define one new template for each specific invoice format
  • Rossum Elis (invoices), TagGun (receipts), ... - APIs based on pre-trained machine learning models, i.e. usable and working immediately, with free monthly volumes
  • LucidTech, Itemize, ... - less accessible APIs with a similar functionality (you need to go through a demo and sales process)
  • Datamolino, CloudFactory, ... - APIs with humans behind the scenes performing the data transcription manually (different latency, pricing and accuracy structure)
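
As a rough illustration of the first option (raw OCR output plus regexes) applied to line items, the Python sketch below assumes each item occupies one OCR text line whose last three columns are quantity, unit price and line amount. That layout, the `LINE_ITEM` pattern, and the `parse_line_items` helper are assumptions for this example; multi-line descriptions or differently ordered columns would need a different pattern.

```python
import re
from decimal import Decimal

# Assumed row layout: description, quantity, unit price, line amount.
LINE_ITEM = re.compile(
    r"^(?P<description>.+?)\s+"
    r"(?P<quantity>\d+(?:\.\d+)?)\s+"
    r"(?P<unit_price>\d[\d,]*\.\d{2})\s+"
    r"(?P<amount>\d[\d,]*\.\d{2})\s*$"
)


def to_decimal(text):
    """Convert '1,200.00' to Decimal('1200.00')."""
    return Decimal(text.replace(",", ""))


def parse_line_items(ocr_text):
    """Return one dict per text line that looks like a line item."""
    items = []
    for line in ocr_text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if not match:
            continue  # headers, totals and free text are skipped
        quantity = to_decimal(match["quantity"])
        unit_price = to_decimal(match["unit_price"])
        amount = to_decimal(match["amount"])
        items.append({
            "description": match["description"],
            "quantity": quantity,
            "unit_price": unit_price,
            "amount": amount,
            # Arithmetic cross-check: a mismatch hints at an OCR misread.
            "consistent": quantity * unit_price == amount,
        })
    return items
```

The `consistent` flag is the useful part: rows where quantity times unit price does not equal the amount can be routed to manual review instead of being stored silently.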

After more R&D (*), there now actually are specialized SDKs with APIs:

For starters, there is a demo at https://rossum.ai/developers

The whole extraction process can now be automated with the API (https://docs.api.rossum.ai/) like this:

To upload an invoice:

# Usage: pass the path to the invoice PDF as the first argument
invoice_file=$1
endpoint='https://all.rir.rossum.ai'
curl -H "Authorization: secret_key $ELIS_API_KEY" -X POST -F file="@$invoice_file;type=application/pdf" "$endpoint/document"

To download the results:

# Usage: pass the invoice ID as the first argument
invoice_id=$1
endpoint='https://all.rir.rossum.ai'
curl -H "Authorization: secret_key $ELIS_API_KEY" "$endpoint/document/$invoice_id"

These bash examples are from https://github.com/rossumai/elis-client-examples/
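
For callers that would rather not shell out to curl, the download request can be reproduced with Python's standard library. This is only a sketch mirroring the curl command above: the endpoint and the `secret_key` header scheme are copied from it and are assumed to still be current, while the helper names and the assumption that the response body is JSON are mine.

```python
import json
import urllib.request

ENDPOINT = "https://all.rir.rossum.ai"  # copied from the bash examples


def build_result_request(invoice_id, api_key, endpoint=ENDPOINT):
    """Build (but do not send) the GET request that the curl download example issues."""
    return urllib.request.Request(
        f"{endpoint}/document/{invoice_id}",
        headers={"Authorization": f"secret_key {api_key}"},
    )


def fetch_result(invoice_id, api_key, endpoint=ENDPOINT, timeout=30):
    """Send the request and decode the body (assumed here to be JSON)."""
    request = build_result_request(invoice_id, api_key, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the URL and header easy to check without touching the network.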

(*) To add: the API is a direct consequence of my own R&D work at the company ;)

Licensed under: CC-BY-SA with attribution