How to determine whether a character is a Chinese character

Updated: 2023-01-31 07:37:20

An interesting article on encodings in Ruby: http://blog.grayproductions.net/articles/bytes_and_characters_in_ruby_18 (it is part of a series, so also check the table of contents at the start of the article).

I haven't used Chinese characters before, but http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs seems to be the list supported by Unicode. Also note that CJK is a unified system that includes Japanese and Korean characters (some characters are shared between the languages), so I'm not sure you can distinguish which ones are Chinese only.

I think you can check whether the character at index n of a string str is a CJK character by calling this:

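def check_char(str, n)
  # Decode the string (assumed to be UTF-8) into an array of Unicode code points
  list_of_chars = str.unpack("U*")
  char = list_of_chars[n]
  # An out-of-range index gives nil, so there is no character to check
  return false if char.nil?
  # CJK Unified Ideographs (main block)
  return true if char >= 0x4E00 && char <= 0x9FFF
  # CJK Unified Ideographs Extension A
  return true if char >= 0x3400 && char <= 0x4DBF
  # CJK Unified Ideographs Extension B
  return true if char >= 0x20000 && char <= 0x2A6DF
  # CJK Unified Ideographs Extension C
  return true if char >= 0x2A700 && char <= 0x2B73F
  false
end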
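
As a possible alternative on Ruby 1.9 or later, the regular expression engine can test the Unicode script property directly, which avoids maintaining the ranges by hand. The sketch below assumes a UTF-8 string; the helper name han_char? is made up for illustration and is not part of the original answer. Note that \p{Han} covers a somewhat broader set than the four ranges in check_char (for example radicals and compatibility ideographs), and it still cannot tell Chinese characters apart from Japanese kanji or Korean hanja.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the original answer): returns true when the
# character at index n of str belongs to the Unicode Han script.
def han_char?(str, n)
  char = str[n]                 # nil when n is out of range
  return false if char.nil?
  !!(char =~ /\p{Han}/)         # =~ returns 0 on a match, nil otherwise
end

puts han_char?("a漢字", 0)      # false: "a" is a Latin letter
puts han_char?("a漢字", 1)      # true: "漢" is a Han ideograph
puts han_char?("ひらがな", 0)   # false: hiragana is a separate script
```

Using the script property rather than fixed ranges means newer extension blocks are picked up as the Ruby version's Unicode tables are updated, at the cost of matching a slightly wider set of characters.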
