An alternative to Tesseract OCR training?

Updated: 2022-06-22 22:45:51

Based on your comment, all you need is to scan a relatively small number of documents with almost 100% accuracy, and your budget is about $200.

Well, the answer is simple then: you don't need a programming solution. Just buy a quality commercial OCR product, e.g. ABBYY FineReader (disclaimer: I work for ABBYY). Its price differs by region, but I'd guess it is roughly within your budget.

Commercial desktop OCR products give you almost 100% accuracy on typical languages out of the box. They also have convenient manual verification tools for fixing any remaining errors. They typically support a wide variety of modern fonts, and if your font is unusual, they have a font training utility for that.

I do think that is the optimal solution for you.

UPDATE: Linux platform. Unfortunately, there is almost no choice of high-quality OCR products for Linux, sorry. The only one I know of is from ABBYY: http://ocr4linux.com/en:start but it has no UI, verification, or font training. Still, you can at least try it to see whether it gives you good enough accuracy as is, which may well be the case.
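If you want to put a number on "good enough accuracy" when trying out any OCR tool, one common approach is to type up a page or two by hand and compare the OCR output against it. Below is a minimal sketch of that idea in Python, using character-level edit distance. The function names `edit_distance` and `character_accuracy` are my own for illustration and are not part of any OCR product; the sketch also assumes you already have the OCR output and the hand-typed reference as plain strings.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_accuracy(ocr_text: str, reference: str) -> float:
    """Return 1 - (edit distance / reference length), clamped at 0.0."""
    if not reference:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(ocr_text, reference)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # Hypothetical sample: the OCR confused the letter "l" with the digit "1".
    reference = "hello world"
    ocr_text = "he1lo wor1d"
    print(f"character accuracy: {character_accuracy(ocr_text, reference):.3f}")
```

Running this on a few representative pages gives you a rough figure to compare tools by. It is only a character-level measure, so it says nothing about layout or reading order.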