且构网


How do I compare string values in different languages in Java?

Updated: 2023-11-05 11:25:46


Transliterate all names into the same language (e.g. English) for searching, and use Levenshtein edit distance to compute the similarity between the phonetic representations of the names. This is slow if you simply compare the query with every name, but if you pre-index all of the place names in your database in a Burkhard-Keller tree (BK-tree), they can be searched efficiently by edit distance from the query term.
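As a rough illustration of the indexing idea, here is a minimal BK-tree sketch in Java. The class name BkTree and its methods are hypothetical, not taken from any library, and the names are assumed to be already transliterated into Latin script and lower-cased (a library such as ICU4J could do the transliteration; that step is not shown).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal BK-tree over strings, keyed by Levenshtein distance (illustrative sketch). */
final class BkTree {

    private static final class Node {
        final String word;
        // Child subtrees, keyed by their edit distance to this node's word.
        final Map<Integer, Node> children = new HashMap<>();

        Node(String word) {
            this.word = word;
        }
    }

    private Node root;

    /** Two-row dynamic-programming Levenshtein distance. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Inserts a word; exact duplicates (distance 0) are ignored. */
    void add(String word) {
        if (root == null) {
            root = new Node(word);
            return;
        }
        Node node = root;
        while (true) {
            int d = levenshtein(word, node.word);
            if (d == 0) {
                return;
            }
            Node child = node.children.get(d);
            if (child == null) {
                node.children.put(d, new Node(word));
                return;
            }
            node = child;
        }
    }

    /** Returns every stored word within maxDistance edits of the query. */
    List<String> search(String query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        if (root != null) {
            search(root, query, maxDistance, matches);
        }
        return matches;
    }

    private void search(Node node, String query, int maxDistance, List<String> matches) {
        int d = levenshtein(query, node.word);
        if (d <= maxDistance) {
            matches.add(node.word);
        }
        // Triangle inequality: only children at distance d - max .. d + max can hold matches.
        for (int k = Math.max(1, d - maxDistance); k <= d + maxDistance; k++) {
            Node child = node.children.get(k);
            if (child != null) {
                search(child, query, maxDistance, matches);
            }
        }
    }
}
```

After adding names such as "moskva" and "moscow", calling search("moskow", 1) visits only the subtrees whose edge distance could still contain a match, rather than computing the distance to every stored name.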


This technique lets you rank names by how closely they actually match. You're probably more likely to find a match this way than with Metaphone or Double Metaphone, though it is harder to implement.
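To make the ranking concrete, here is a small sketch that orders candidate names from closest to farthest match. NameRanker and its methods are hypothetical names chosen for illustration, and the inputs are again assumed to be already transliterated and lower-cased.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Ranks candidate names by Levenshtein distance to a query (illustrative sketch). */
final class NameRanker {

    /** Two-row dynamic-programming Levenshtein distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns a copy of candidates sorted by distance to the query, ties broken alphabetically. */
    static List<String> rank(String query, List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingInt((String c) -> editDistance(query, c))
                .thenComparing(Comparator.naturalOrder()));
        return sorted;
    }
}
```

A phonetic code such as Metaphone only says whether two names share a code; the distance used here gives a graded score, which is what makes the ordering possible.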