且构网

Sharing the things programmers run into during development...

QString to Unicode std::string

Updated: 2023-11-14 19:43:46

The following applies to Qt 5. Qt 4's behavior was different and, in practice, broken.

You need to choose:

  1. Whether you want the 8-bit wide std::string, the wide std::wstring (16 or 32 bits per code unit, depending on the platform), or some other type.

  2. What encoding is desired in your target string?
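The width half of that choice can be inspected at compile time. The following is a minimal sketch in standard C++ (no Qt required); the helper names `codeUnitBits` and `wstringIs16Bit` are hypothetical and only serve the illustration.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Bits per code unit of a standard string type (hypothetical helper name).
template <typename String>
constexpr std::size_t codeUnitBits() {
    return sizeof(typename String::value_type) * CHAR_BIT;
}

// The standard guarantees these minimum widths.
static_assert(codeUnitBits<std::u16string>() >= 16, "char16_t holds a UTF-16 code unit");
static_assert(codeUnitBits<std::u32string>() >= 32, "char32_t holds a whole code point");

// wchar_t is platform-dependent: commonly 16 bits on Windows and 32 bits on
// Linux and macOS. This reports which case the current platform falls into.
constexpr bool wstringIs16Bit() {
    return codeUnitBits<std::wstring>() == 16;
}
```

Calling `wstringIs16Bit()` tells you whether a std::wstring on the current platform can hold UTF-16 code units only, or whole code points.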

Internally, QString stores UTF-16 encoded data, so any Unicode code point is represented by one or two QChars.
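To see why one or two 16-bit units suffice, here is a rough sketch of UTF-16 encoding for a single code point, written in standard C++ rather than against the Qt API. The function name `encodeUtf16` is hypothetical; Qt does this work internally.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-16 (hypothetical helper).
// Code points below U+10000 take one code unit; the rest take a surrogate pair.
inline std::u16string encodeUtf16(char32_t cp) {
    assert(cp <= 0x10FFFF);              // beyond the Unicode range
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogate code points cannot be encoded
    if (cp < 0x10000) {
        return std::u16string(1, static_cast<char16_t>(cp));
    }
    const char32_t v = cp - 0x10000;     // 20 significant bits remain
    std::u16string out;
    out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate: top 10 bits
    out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate: bottom 10 bits
    return out;
}
```

For instance, U+20AC (the euro sign) yields a single code unit, while U+1F600 yields the pair 0xD83D, 0xDE00. This is why the size of a QString counts code units and can exceed the number of code points.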

Common cases:

  • Locally encoded 8-bit std::string (as in: system locale):

    std::string(str.toLocal8Bit().constData())

  • UTF-8 encoded 8-bit std::string:

    str.toStdString()
    

    This is equivalent to:

    std::string(str.toUtf8().constData())
    

  • UTF-16 or UCS-4 encoded std::wstring, 16 or 32 bits wide, respectively. Qt selects the 16-bit or 32-bit encoding to match the width of the platform's wchar_t.

    str.toStdWString()
    

  • C++11 U16 or U32 strings (std::u16string, std::u32string), available from Qt 5.5 onwards:

    str.toStdU16String()
    str.toStdU32String()
    

  • UTF-16 encoded 16-bit std::u16string. This hack is only needed up to Qt 5.4:

    std::u16string(reinterpret_cast<const char16_t*>(str.constData()))
    

    This encoding does not include a byte order mark (BOM).

    It's easy to prepend a BOM to the QString itself before converting it:

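    QString src = ...;
    src.prepend(QChar::ByteOrderMark);
    #if QT_VERSION < QT_VERSION_CHECK(5,5,0)
    // Parentheses rather than braces: src.size() is an int, and list-initialization
    // treats the narrowing conversion to the size type as ill-formed.
    auto dst = std::u16string(reinterpret_cast<const char16_t*>(src.constData()),
                              src.size());
    #else
    auto dst = src.toStdU16String();
    #endif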
    

    If you expect the strings to be large, you can skip one copy:

    const QString src = ...;
    std::u16string dst;
    dst.reserve(src.size() + 1); // room for the BOM; std::u16string manages its own terminator
    dst.push_back(char16_t(QChar::ByteOrderMark)); // append() has no single-character overload
    dst.append(reinterpret_cast<const char16_t*>(src.constData()),
               src.size());
    

    In both cases, dst is now portable to systems with either endianness, since the leading BOM lets the reader detect the byte order.
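On the receiving side, the BOM is what makes the data self-describing. The sketch below, in standard C++ without Qt, shows one way a consumer might use it; `swapBytes` and `decodeWithBom` are hypothetical helper names, and treating BOM-less input as native order is an assumption made for this example.

```cpp
#include <cstddef>
#include <string>

// Swap the two bytes of one UTF-16 code unit.
inline char16_t swapBytes(char16_t u) {
    return static_cast<char16_t>(((u & 0x00FF) << 8) | ((u & 0xFF00) >> 8));
}

// Strip a leading BOM and return the text in native byte order (hypothetical helper).
// A first unit of 0xFEFF means the data is already in native order; 0xFFFE means it
// was written with the opposite endianness and every unit must be byte-swapped.
inline std::u16string decodeWithBom(const std::u16string& in) {
    if (in.empty()) {
        return in;
    }
    if (in.front() == 0xFEFF) {
        return in.substr(1);
    }
    if (in.front() == 0xFFFE) {
        std::u16string out;
        out.reserve(in.size() - 1);
        for (std::size_t i = 1; i < in.size(); ++i) {
            out.push_back(swapBytes(in[i]));
        }
        return out;
    }
    return in;  // no BOM: assumed to be native order (an assumption of this sketch)
}
```

A string produced by either snippet above and read on a machine with the same byte order takes the first branch; one read on a machine with the opposite byte order takes the second.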