Question

I've searched around for open source OCR for Chinese, but without any luck: there seem to be hardly any usable open source OCR engines for Chinese.

So I am here wondering:

  1. Is there any open source OCR for Chinese that could be used in a production environment?

  2. What are the main differences between implementing OCR for Latin-script languages and for Chinese? I know of some good OCR engines such as Tesseract and Ocropus; what should I do if I want to make one of them support Chinese?

Any help is appreciated and thanks in advance~

Solution

You can choose between:

  • Tesseract 3.0, which supports Chinese and Japanese
  • NHOCR, which supports Japanese
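As a rough sketch of the Tesseract option, the snippet below builds and runs the standard `tesseract <image> <outputbase> -l <lang>` command from Python. It assumes the Tesseract binary is on your PATH and that the Simplified Chinese language data (`chi_sim`, or `chi_tra` for Traditional) is installed. The function names and the `page.png` file name are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="chi_sim"):
    """Return the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base="ocr_result", lang="chi_sim"):
    """Run Tesseract and return the recognized text, or None if it is not installed."""
    if shutil.which("tesseract") is None:
        return None  # Tesseract binary not found on PATH
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical input file; replace with a real scan
    print(build_tesseract_command("page.png", "page"))
```

Calling `ocr_image("page.png", "page", lang="chi_tra")` would try the Traditional Chinese model instead. How good the result is depends heavily on the scan quality and the trained data you have.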

OTHER TIPS

Chinese has far more characters than Latin-script languages. There are some commercial products; one option is to contact the vendors and ask for help.
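To put a rough number on the character-count gap, here is a small sketch that compares the 52 ASCII letters with the size of the Unicode CJK Unified Ideographs block (U+4E00 to U+9FFF). The block size is only an upper bound on what a recognizer would have to tell apart, and not every character in it is in common use; the helper names are made up for illustration.

```python
import string

# Basic CJK Unified Ideographs block in Unicode
CJK_FIRST = 0x4E00
CJK_LAST = 0x9FFF


def cjk_block_size():
    """Number of code points in the basic CJK Unified Ideographs block."""
    return CJK_LAST - CJK_FIRST + 1


def is_cjk_ideograph(ch):
    """True if the single character ch falls inside the basic CJK block."""
    return CJK_FIRST <= ord(ch) <= CJK_LAST


if __name__ == "__main__":
    latin_classes = len(string.ascii_letters)  # upper and lower case a-z
    print("Latin letter classes:", latin_classes)
    print("CJK block code points:", cjk_block_size())
    print("Is '中' a CJK ideograph?", is_cjk_ideograph("中"))
```

A classifier for Latin text picks among a few dozen classes per glyph, while one for Chinese has to pick among thousands, which is one reason the training data and the recognition step are so much heavier.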

I don't think there is a usable open source OCR engine for Chinese or Japanese characters. OCR involves a lot of techniques beyond the pattern recognition algorithm itself, and that is where commercial vendors tend to be stronger than the open source community.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow