Mathematica's TextRecognize not up to par

https://stackoverflow.com/questions/8919253

17-04-2021

Question

Please take a look at the screenshot below and see if you can tell me why this won't work. The examples on the reference page for TextRecognize look pretty impressive, so I don't think recognizing single letters like this should be a problem. I've tried resizing the letters as well as sharpening the image.

For convenience in case you want to try this yourself I have included the image that I use at the bottom of this post. You can also find plenty more like this by searching for "Wordfeud" in Google Image Search.

Mathematica screenshot

Wordfeud board

Solution

Very cool question!

TextRecognize uses heuristics to recognize whole words from the English language. This is the gotcha that makes recognizing single letters very hard.

Consider the following line of thought:

s = Import["http://i.stack.imgur.com/JHYuh.png"];
p = ImagePartition[s, 32]

Now pick letters to form the English word 'EXIT':

x = {p[[1, 13]], p[[6, 6]], p[[3, 13]], p[[1, 12]]}

Now clean up these images a bit, like so:

d = ImageAssemble[ Map[ImageTake[#, {3, 27}, {2, 20}] &, x ]];

Then this returns the string "EXIT":

TextRecognize[d]

Mathematica graphics
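
The steps above can be wrapped into a small helper. This is only a sketch: the name recognizeTiles is made up for illustration, the crop margins are the ones used above, and it assumes p is the partitioned board from ImagePartition[s, 32]. Since TextRecognize leans on English-word heuristics, it will likely do best when the chosen tiles spell a real word.

```mathematica
(* Hypothetical helper: crop each 32x32 tile, join the crops into one row,
   and run TextRecognize on the assembled strip. *)
recognizeTiles[tiles_List] :=
 TextRecognize[
  ImageAssemble[{ImageTake[#, {3, 27}, {2, 20}] & /@ tiles}]]

(* Same tiles as in x above *)
recognizeTiles[{p[[1, 13]], p[[6, 6]], p[[3, 13]], p[[1, 12]]}]
```

The crop margins may need adjusting for boards captured at a different size.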

OTHER TIPS

This approach is completely different from using TextRecognize, so I am posting it as a separate answer. It uses the same image recognition technique as the answers to "How do I find Waldo with Mathematica?".

First get the puzzle:

wordfeud = Import["http://i.stack.imgur.com/JHYuh.png"]

Mathematica graphics

And then get the pieces of the puzzle:

Grid[pieces = ImagePartition[wordfeud, 32]]

Mathematica graphics

Let's be interested in the letter E:

LetterE = pieces[[4, 3]]

Mathematica graphics

Get the correlation image:

correlation = 
 ImageCorrelate[wordfeud, Binarize[LetterE], 
 NormalizedSquaredEuclideanDistance]

Mathematica graphics

And highlight the matches:

positions = Dilation[ColorNegate[Binarize[correlation, .1]], DiskMatrix[20]];
found = ImageMultiply[wordfeud, ImageAdd[ColorConvert[positions, "GrayLevel"], .5]]

Mathematica graphics

As before, this requires a bit of tuning on binarizing the correlation image, but other than that this should help to identify bits and pieces of this puzzle.
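
To go from highlighted matches to board positions, one option is to measure the centroids of the matched blobs and convert them to tile indices. The following is a rough sketch, not part of the original answer. It assumes wordfeud and correlation are defined as above, that tiles are 32 pixels wide, and that ImagePartition counts rows from the top of the image; the names matches, centroids and tileIndex are made up for illustration.

```mathematica
(* Undilated match mask, using the same 0.1 threshold as above *)
matches = ColorNegate[Binarize[correlation, .1]];

(* Centroids come back as rules label -> {x, y}, with y measured from the bottom *)
centroids = ComponentMeasurements[matches, "Centroid"][[All, 2]];

{w, h} = ImageDimensions[wordfeud];

(* Convert an {x, y} position to {row, column}, rows counted from the top (assumed) *)
tileIndex[{x_, y_}] := {Ceiling[(h - y)/32], Ceiling[x/32]};

Union[tileIndex /@ centroids]
```

If the threshold lets through spurious blobs, the resulting list will contain extra tiles, so the same tuning caveat applies here.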

I thought the quality of your image might be interfering. Binarizing your image did not help: recognition was zilch. I also tried a very sharp black and white image of a crossword puzzle solution (see below). Again, nothing was recognized, whether in regular or binarized format.

crossword solution

So I removed the black background leaving only the letters and their thin black frames. Again, recognition was about 0%.

When I removed the frames from around some of the letters AND binarized the image the only parts that were recognizable were those regions in which there was nothing but letters. (see below)

crossword 2

Notice in the output below, ANTS, TIRES, and TEXAS are correctly identified (as well as VECTORS), but just about nothing else.

Notice also that, even though the strings were widely spaced, mma interpreted them as words, rather than separate letters. Note "TEXAS" instead of "T E X A S".

TextRecognize[Binarize@img]

(* output *)
ANTS FFWWW FEEWF
E R o If IU I?
E A FI5F WWWFF 5
5552? L E F F
T s E NTT BT|
H0RWW@0WVlWF;EE F
5 W E   ; OCS
FOFT W W R AL%AE
A TT I T ? _
i iE@W'NF WG%S W
A A EW F I i
SWWTW W ALTFCWD N
H A V 5 A F F
PLATT EWWLIGHT
W N E T
HE TIRES C
TEXAS VECTORS

I didn't have the patience to completely clean up the image. It would have been much faster to retype the text by hand.

Conclusion: Don't use text recognition in mma unless you have absolutely clear text against an even-colored, bright, preferrably white, background.

The results also varied depending on the file format used. Avoid .pdf altogether.

Edit

acl captured and tried to recognize the last 5 lines (above Edit). His results (in a comment below): mostly gibberish.

I decided to do the same. But since Prashant warned that text size makes a difference, I zoomed in first so that the text appeared (to my eyes) to be about 20 pica. Below is the picture of the text I scanned and ran through TextRecognize.

text2

Here's the result of an unbinarized TextRecognize (at that large size):

Gliii. Q lk-ii`t`*¥ if EY £\[CloseCurlyDoubleQuote]1\[Euro]'EE \
Di'¥C~E\"P ITF SKI' T»f}!E'!',IL:?E\[CloseCurlyDoubleQuote] I 2 VEEE5\
\[CloseCurlyQuote] LEP \"- \"VE
1. ur e=\\..r.1.»».»\\\\ rw r 1»»\\|a'*r | r .fm -»'-an \
\[OpenCurlyQuote] -.-rr -_.»~|-.'i~-.w~,.-- nv n.w~»-\
\[OpenCurlyDoubleQuote]~"

Now, here's the result for the TextRecognize of the binarized image. The original image was a .png from Jing.

I didn't have the patience to completely clean up the image. It would \
have been much faster to retype the
text by hand.
Conclusion: Don't use text recognition in mma unless you have \
absolutely clear text against an even-
colored, bright, preferrably white, background.
The results also varied depending on the file format used. Avoid .pdf \
altogether.
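
The preprocessing that made the difference here (enlarging the text, then binarizing) can be sketched in a few lines. This is an illustration under assumptions, not the exact procedure used above: img stands for whatever image was imported, and the scale factor of 2 is an arbitrary starting point to tune.

```mathematica
(* img is a placeholder for the imported screenshot or scan *)
enlarged = ImageResize[img, Scaled[2]];  (* factor 2 is an assumption; adjust as needed *)

(* Binarize before recognition, as in the successful run above *)
TextRecognize[Binarize[enlarged]]
```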

Licensed under: CC-BY-SA with attribution
