且构网

Tesseract OCR: how to improve the results?

Updated: 2022-06-22 22:46:15

The output of Tesseract 4.00alpha with your image is:

$ tesseract ICKcj.png - -l eng
*: 4606 Y; 4809 Z; 698

Warning. Invalid resolution 0 dpi. Using 70 instead.

Resample the picture to 50% and set the DPI to 300:
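The answer does not say which tool was used for the resampling. As a minimal sketch, assuming the third-party Pillow package is installed, it could be done as follows. The function names `scaled_size` and `resample` are hypothetical helpers introduced for illustration; the file names are the ones from the commands in this answer.

```python
def scaled_size(size, factor=0.5):
    """Return (width, height) scaled by factor, never smaller than 1 pixel."""
    width, height = size
    return max(1, round(width * factor)), max(1, round(height * factor))


def resample(src, dst, factor=0.5, dpi=300):
    """Write a resized copy of src to dst and record the DPI in its metadata."""
    # Pillow is a third-party package (pip install Pillow). It is imported
    # here so that scaled_size() stays usable without it.
    from PIL import Image

    with Image.open(src) as img:
        small = img.resize(scaled_size(img.size, factor), Image.LANCZOS)
        # For PNG files the dpi argument stores the resolution in the file,
        # which is the value the "Invalid resolution 0 dpi" warning refers to.
        small.save(dst, dpi=(dpi, dpi))
```

Calling `resample("ICKcj.png", "ICKcj-50.png")` would produce the input file used in the next command. The scale factor and the DPI value are the ones named above; other images may need different settings.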

The output with this image is slightly better, and the warning disappears:

$ tesseract ICKcj-50.png - -l eng
X: 4606 Y: 4809 Z: 698

The only things missing are the minus signs, which are printed quite irregularly (a higher resolution in the picture could help). It is also possible to restrict the output pattern in Tesseract. Alternatively, you can try to guess the minus signs afterwards from the spacing between the X, Y, Z labels and the numbers.
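The post-processing idea can be sketched in Python. This is only an illustration under stated assumptions: the function `parse_coordinates` and its `gap_means_minus` threshold are hypothetical, and the heuristic assumes that a dropped minus sign leaves a wider gap in the recognized text (which may require a setting such as `preserve_interword_spaces`). It also checks the text against the expected three-field pattern after recognition, which is not the same as restricting the pattern inside Tesseract itself.

```python
import re

# One field: a one-character label, ':' or ';', a gap, an optional minus, digits.
# The label is not trusted, since OCR may misread it (e.g. '*' instead of 'X').
FIELD = re.compile(r"(\S)\s*[:;](\s*)(-?)(\d+)")


def parse_coordinates(text, gap_means_minus=2):
    """Parse 'X: 4606 Y: 4809 Z: 698'-style OCR text into a dict of ints.

    Assumption: a gap of at least gap_means_minus spaces after the separator
    marks a minus sign that the OCR dropped. Tune this for your own data.
    """
    matches = FIELD.findall(text)
    if len(matches) != 3:
        raise ValueError(f"expected 3 fields, found {len(matches)}: {text!r}")

    values = {}
    # Assign the fields to X, Y, Z by position rather than by the OCR'd label.
    for axis, (_label, gap, sign, digits) in zip("XYZ", matches):
        number = int(digits)
        if sign or len(gap) >= gap_means_minus:
            number = -number
        values[axis] = number
    return values
```

With the two outputs shown above, both `"*: 4606 Y; 4809 Z; 698"` and `"X: 4606 Y: 4809 Z: 698"` parse to the same three values, because the labels are assigned by position. Whether a wide gap really corresponds to a minus sign has to be verified against the original picture.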