Visual C++: Migrating traditional C and C++ string code to a Unicode world

Updated: 2022-11-09 12:35:18


I see that Visual Studio 2008 and later now start off a new solution with the Character Set set to Unicode. My old C++ code deals with only English ASCII text and is full of:

  • Literal strings like "Hello World"
  • char type
  • char * pointers to allocated C strings
  • STL string type
  • Conversions from STL string to C string and vice versa using STL string constructor (which accepts const char *) and STL string.c_str()

    1. What changes do I need to make to migrate this code so that it works in an ecosystem of Visual Studio Unicode and Unicode-enabled libraries? (I have no real need for it to work with both ASCII and Unicode; it can be pure Unicode.)

    2. Is it also possible to do this in a platform independent way? (i.e., by not using Microsoft types.)

I see so many wide-character and Unicode types and conversions scattered around, hence my confusion (e.g., wchar_t, TCHAR, _T, _TEXT, TEXT, etc.).

I strongly recommend against L"", _T(), std::wstring (the last of these is not multiplatform), and Microsoft's recommendations on how to do Unicode.

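There's a lot of confusion on this subject. Some people still think Unicode == 2-byte characters == UTF-16. Neither equality is correct: Unicode defines code points, not a byte width, and UTF-16 itself needs two 16-bit units (a surrogate pair) for any code point above U+FFFF.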
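To make that point concrete, here is a minimal sketch that stores the same four characters in UTF-8 and in UTF-16 and lets you compare how many code units each one needs. The array names `kUtf8` and `kUtf16` are made up for this illustration, and the UTF-8 bytes are written as explicit escapes so the result does not depend on the source-file encoding.

```cpp
#include <string>

// Four sample characters: 'A' (U+0041), e-acute (U+00E9),
// euro sign (U+20AC), and an emoji outside the BMP (U+1F600).

// UTF-8: each element holds the raw bytes of one character.
const std::string kUtf8[] = {
    "A",                 // 1 byte
    "\xC3\xA9",          // 2 bytes
    "\xE2\x82\xAC",      // 3 bytes
    "\xF0\x9F\x98\x80",  // 4 bytes
};

// UTF-16: each element holds the 16-bit code units of one character.
const std::u16string kUtf16[] = {
    u"A",           // 1 unit
    u"\u00E9",      // 1 unit
    u"\u20AC",      // 1 unit
    u"\U0001F600",  // 2 units (a surrogate pair)
};
```

Calling `size()` on each element gives the number of code units, not the number of characters. Neither encoding is fixed-width, so "one element of the string" never reliably means "one character" in either of them.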

In fact, it's possible, and even better, to stay with char*, plain std::string, and plain literals, and to change very little (while still fully supporting Unicode!).
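One way to read this advice, assuming the existing `char`-based strings are treated as UTF-8, is that the code keeps `std::string` everywhere and converts to UTF-16 only at the boundary where a wide-character API is called. The sketch below shows such a boundary conversion in portable C++. The helper name `utf8_to_utf16` is hypothetical and not taken from the answer; on Windows the same job is usually done with the Win32 function `MultiByteToWideChar` and the `CP_UTF8` code page.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts UTF-8 text to UTF-16 code units.
// Throws std::invalid_argument on malformed or truncated input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const char32_t kMinCp[] = {0x0u, 0x80u, 0x800u, 0x10000u};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;         extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1Fu; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0Fu; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07u; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Fold in six payload bits from each continuation byte.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3Fu);
        }
        i += extra + 1;

        if (cp < kMinCp[extra] || cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu))
            throw std::invalid_argument("invalid code point");

        if (cp < 0x10000u) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000u;
            out.push_back(static_cast<char16_t>(0xD800u + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00u + (cp & 0x3FFu)));
        }
    }
    return out;
}
```

Because ASCII is a subset of UTF-8, a plain literal such as "Hello World" passes through unchanged, and the existing `std::string` constructor and `c_str()` calls keep working as they are. Only the call sites that hand text to a wide-character API need the conversion.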

See my answer here: http://***.com/questions/1049947/should-utf-16-be-considered-harmful/1855375#1855375 for what is, in my opinion, the easiest way to do it.