且构网

Chinese characters in source code when UTF-8 settings can't be used

Updated: 2023-01-30 21:16:22

The fact is, 你好 is a sequence of Unicode characters. You will need to use a Unicode character set to ensure that it is displayed correctly.

The only possible exception is if you're using a multi-byte character set (MBCS) that includes both of these characters in its basic character set. Since you say that you're stuck compiling for MBCS anyway, that might be a solution. To make it work, you will have to set the system language to one whose character set includes these characters. The exact way you do this changes in each OS version. I think they're trying to "improve" it. On Windows 7, at least, they call this the "Language for non-Unicode programs" setting, accessible in the "Region and Language" control panel.
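To make the MBCS route concrete, here is a minimal sketch. It assumes the system language is set to Simplified Chinese, so that the ANSI code page is 936 (GBK); under that assumption 你好 is the four bytes C4 E3 BA C3. Writing those bytes as hex escapes keeps the source file pure ASCII, so the compiler's source-encoding setting stops mattering. The names kHelloGbk and gbk_char_count are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "你好" in code page 936 (GBK): 你 = C4 E3, 好 = BA C3.
// ASSUMPTION: the program runs with code page 936 as the ANSI code page.
// Under any other code page these bytes display as something else.
static const char kHelloGbk[] = "\xC4\xE3" "\xBA\xC3";

// Hypothetical helper: counts characters in a GBK string by treating any
// byte in 0x81..0xFE as the lead byte of a two-byte character.
std::size_t gbk_char_count(const char* s) {
    assert(s != nullptr);
    std::size_t count = 0;
    while (*s != '\0') {
        unsigned char b = static_cast<unsigned char>(*s);
        bool is_lead = (b >= 0x81 && b <= 0xFE);
        // Skip the trail byte too, unless the string ends right after the lead byte.
        s += (is_lead && s[1] != '\0') ? 2 : 1;
        ++count;
    }
    return count;
}
```

Here std::strlen(kHelloGbk) reports 4 bytes while gbk_char_count(kHelloGbk) reports 2 characters, which is exactly the mismatch MBCS-aware code has to handle. In a real Windows program you would more likely use the Win32 function IsDBCSLeadByte, which consults the current code page, rather than hard-coding the GBK range as this sketch does.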

If there is no system language that provides these characters as part of its basic character set, then you are basically out of luck.

Even if you tried to use UTF-8 (which Windows does not natively support, preferring UTF-16 for its Unicode support), which does fit in the char data type, it is very likely that whatever other application or library you're interfacing with would not be able to deal with it. Windows applications assume that a char holds a character in the current ANSI/MB character set. Unicode characters go in a wchar_t, and since you can't use that, it indicates the application simply doesn't support Unicode. (That means it's broken, by the way. Time to upgrade.)
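To illustrate the char versus wchar_t point, here is a rough sketch of the same two characters stored both ways. The UTF-8 form is six char bytes that an ANSI/MBCS consumer would misread under its own code page; the wide form is two wchar_t units. The names kHelloUtf8, kHelloWide and decode_utf8 are hypothetical, and the decoder assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cwchar>
#include <vector>

// "你好" as UTF-8 bytes in char: U+4F60 = E4 BD A0, U+597D = E5 A5 BD.
static const char kHelloUtf8[] = "\xE4\xBD\xA0" "\xE5\xA5\xBD";
// The same text as a wide literal, written with universal character names
// so the source file itself stays ASCII.
static const wchar_t kHelloWide[] = L"\u4F60\u597D";

// Hypothetical helper: decodes UTF-8 into Unicode code points.
// ASSUMPTION: the input is well-formed UTF-8; nothing is validated.
std::vector<std::uint32_t> decode_utf8(const char* s) {
    assert(s != nullptr);
    std::vector<std::uint32_t> out;
    while (*s != '\0') {
        unsigned char b = static_cast<unsigned char>(*s++);
        std::uint32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        while (extra-- > 0 && *s != '\0') {
            // Each continuation byte contributes its low six bits.
            cp = (cp << 6) | (static_cast<unsigned char>(*s++) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Decoding kHelloUtf8 yields the code points U+4F60 and U+597D, the same values stored directly in kHelloWide. A library that expects ANSI/MBCS text in a char* has no way to know the six bytes are UTF-8, which is why passing them through usually ends in garbage.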