且构网

Sharing all things programmer development...

How to detect a language

Updated: 2023-02-26 13:52:39

Depending on what you're doing, you might want to check out the Natural Language Toolkit (NLTK) for Python, which has some support for Bayesian learning algorithms.
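To make the Bayesian idea concrete, here is a minimal from-scratch sketch of a naive Bayes classifier over character trigrams with add-one smoothing. It deliberately uses only the standard library and is not NLTK's API; the class name `NaiveBayesLanguageGuesser`, the helper `trigrams`, and the two toy training sentences are all assumptions made up for illustration. A usable detector would need far more training text per language.

```python
from collections import Counter
import math


def trigrams(text):
    """Split text into overlapping 3-character chunks, padded with spaces."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class NaiveBayesLanguageGuesser:
    """Hypothetical toy classifier: naive Bayes over character trigrams."""

    def __init__(self):
        self.counts = {}       # label -> Counter of trigram occurrences
        self.docs = Counter()  # label -> number of training texts seen

    def train(self, labeled_texts):
        for text, label in labeled_texts:
            self.counts.setdefault(label, Counter()).update(trigrams(text))
            self.docs[label] += 1

    def classify(self, text):
        vocab = set().union(*self.counts.values())
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Log prior: how often this label appeared in training.
            score = math.log(self.docs[label] / total_docs)
            for gram in trigrams(text):
                # Add-one smoothing, with one extra slot for unseen trigrams.
                score += math.log((counter[gram] + 1) / (total + len(vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (assumption: far too small for real use).
guesser = NaiveBayesLanguageGuesser()
guesser.train([
    ("the cat sat on the mat", "english"),
    ("el gato se sentó en la alfombra", "spanish"),
])

print(guesser.classify("the cat"))
print(guesser.classify("el gato"))
```

Working in log space avoids underflow when many small probabilities are multiplied, and the smoothing keeps a single unseen trigram from zeroing out a language entirely.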

In general, letter and word frequencies would probably be the fastest evaluation, but NLTK (or a Bayesian learning algorithm in general) will probably be useful if you need to do anything beyond identifying the language. Bayesian methods will probably also be useful if you find that the first two methods (letter and word frequencies) have too high an error rate.
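As a rough illustration of the letter-frequency approach, the sketch below builds a letter-frequency profile per language and picks the language whose profile is closest to the input by cosine similarity. The sample sentences, the `SAMPLES` table, and the function names are assumptions for illustration only; the choice of cosine similarity is also mine, and other distance measures would work too. With reference texts this short, results on arbitrary input will be unreliable.

```python
from collections import Counter
import math

# Tiny illustrative reference texts (assumption: real profiles need much more text).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "german": "der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y luego el perro duerme bajo el sol",
}


def letter_profile(text):
    """Return the relative frequency of each letter in text."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(ch, 0.0) for ch, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


PROFILES = {lang: letter_profile(text) for lang, text in SAMPLES.items()}


def guess_language(text):
    """Pick the language whose letter profile is most similar to the text's."""
    profile = letter_profile(text)
    return max(PROFILES, key=lambda lang: cosine(profile, PROFILES[lang]))


print(guess_language("where is the nearest train station"))
```

The same structure works for word frequencies: swap `letter_profile` for a function that counts whole words instead of letters.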