I have created a console application and added a reference to tessnet2_32.

Ocr ocr = new Ocr();
using (Bitmap bmp = new Bitmap(filename))
{
    tessnet2.Tesseract tessocr = new tessnet2.Tesseract();
    tessocr.Init(@"C:\temp\tessdata", "eng", false);
...

I also tried changing "C:\temp\tessdata" to each of the following:

C:\work\ConsoleApplication3\ConsoleApplication3
C:\work\ConsoleApplication3\ConsoleApplication3\tessdata
C:\work\ConsoleApplication3\ConsoleApplication3\bin\debug
C:\work\ConsoleApplication3\ConsoleApplication3\bin
C:\work\ConsoleApplication3\ConsoleApplication3\bin\debug\tessdata
C:\work\ConsoleApplication3\ConsoleApplication3\bin\tessdata
C:\work\ConsoleApplication3\ConsoleApplication3\debug\tessdata
C:\work\ConsoleApplication3\tessdata
C:\work\ConsoleApplication3\

The tessdata folder itself contained 9 files and was added to all of these locations:

eng.cube.bigrams
eng.cube.fold
eng.cube.lm
eng.cube.bigrams
eng.cube.params
eng.cube.size
eng.cube.word-freq
eng.tesseract_cube.nn
eng.traineddata

But it always exits at that .Init line with a message:

The file 'z:\dev\interne\cs\tesseract-ocr-svn\dotnet\tessnet2.cpp' does not exist.

I cannot imagine why it is trying to access some Z disk while I only have C. Or I just completely misunderstand the error.

Can someone be kind enough to post step-by-step instructions and/or tell me what I am doing wrong? I feel completely lost even after reading 30+ Google links.


Solution

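You are using the wrong version of the language data files: what you have is for Tesseract 3.0x. tessnet2 is a .NET wrapper for Tesseract 2.04, so you need to load data files that are compatible with that version. The z:\ path in the message is most likely just the location of the source file on the machine where tessnet2 was built, not a file that has to exist on your disk.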

Try downloading tesseract-2.00.eng.tar.gz from https://sourceforge.net/projects/tesseract-ocr-alt/files/ and extracting its contents into your tessdata folder.
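
If it helps, here is a minimal sketch of a pre-flight check you could run before calling Init, to confirm that the folder holds Tesseract 2 style data rather than the 3.0x files listed in the question. The class name TessdataCheck and the method MissingFiles are made up for illustration and are not part of tessnet2. The list of eight file suffixes is my assumption about what the 2.0x English pack contains, so compare it against what you actually extract from the archive.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TessdataCheck
{
    // Assumed file suffixes of the Tesseract 2.0x English language pack.
    // Verify this list against the contents of the archive you downloaded.
    static readonly string[] Tess2Suffixes =
    {
        "DangAmbigs", "freq-dawg", "inttemp", "normproto",
        "pffmtable", "unicharset", "user-words", "word-dawg"
    };

    // Returns the names of the expected files that are not present in tessdataDir.
    public static List<string> MissingFiles(string tessdataDir, string lang)
    {
        List<string> missing = new List<string>();
        foreach (string suffix in Tess2Suffixes)
        {
            string name = lang + "." + suffix;
            if (!File.Exists(Path.Combine(tessdataDir, name)))
                missing.Add(name);
        }
        return missing;
    }

    static void Main(string[] args)
    {
        // Defaults to the folder used in the question; pass another path as the first argument.
        string dir = args.Length > 0 ? args[0] : @"C:\temp\tessdata";
        List<string> missing = MissingFiles(dir, "eng");
        if (missing.Count == 0)
            Console.WriteLine("Tesseract 2 data for 'eng' looks complete in " + dir);
        else
            Console.WriteLine("Missing from " + dir + ": " + string.Join(", ", missing.ToArray()));
    }
}
```

If this reports missing files for a folder that only contains eng.traineddata and the eng.cube.* files, that would be consistent with the folder holding 3.0x data, which tessnet2 cannot load.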

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow