What is the type of string literals in C and C++?

Updated: 2023-11-13 08:18:34

In C, a string literal has type char[]. The type is not const-qualified, but it is undefined behavior to modify the contents. Also, two different string literals that have the same content (or enough of the same content) might or might not share the same array elements.

From the C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
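The sharing rule quoted above can be probed with a pointer comparison. This is only a sketch: same_storage and same_contents are hypothetical helper names, and the block is written in C++, which likewise leaves it to the implementation whether identical literals occupy the same storage.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if both pointers refer to the same array element.
// For two identical literals the result may legitimately be true or false.
bool same_storage(const char* a, const char* b) { return a == b; }

// Hypothetical helper: true if the two strings have equal contents.
// This is the only comparison that is safe to rely on.
bool same_contents(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

A call such as same_storage("abc", "abc") can return either value depending on the compiler and its settings, so code should compare contents (same_contents) rather than addresses.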

In C++, "An ordinary string literal has type 'array of n const char'" (from 2.13.4/1 "String literals"). But there's a special case in the C++ standard that lets pointers to string literals convert easily to non-const-qualified pointers (4.2/2 "Array-to-pointer conversion"):
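The "array of n const char" wording can be checked at compile time. The following is a minimal sketch assuming a C++11 or later compiler; literal_length is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// "hello" is five characters plus the terminating null, so n is 6.
// A string literal is an lvalue, so decltype yields a reference to the array.
static_assert(std::is_same<decltype("hello"), const char (&)[6]>::value,
              "an ordinary string literal is an array of n const char");

// sizeof sees the whole array (terminator included), not a pointer.
static_assert(sizeof("hello") == 6, "five characters plus the null");

// Hypothetical helper: recovers n by binding the literal to an array reference.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N; }
```

Because the parameter is a reference to an array, no array-to-pointer conversion happens in literal_length, which is why the length is still available.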

A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type "pointer to char"; a wide string literal can be converted to an rvalue of type "pointer to wchar_t".

As a side note - because arrays in C/C++ convert so readily to pointers, a string literal can often be used in a pointer context, much like any other array in C/C++.
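A short C++ sketch of a literal being used in pointer contexts; the function names here are made up for the example and are not from the standard or the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Takes a pointer: a literal argument converts from const char[n] to const char*.
std::size_t count_chars(const char* s) { return std::strlen(s); }

// Indexing a literal directly: the array converts to a pointer, then [] applies.
char third_char() { return "hello"[2]; }

// Dereferencing a literal yields its first character.
char first_char() { return *"hello"; }
```

Here count_chars("hello") returns 5 (strlen does not count the terminator), third_char() returns 'l', and first_char() returns 'h'.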

Additional editorializing: what follows is really mostly speculation on my part about the rationale for the choices the C and C++ standards made regarding string literal types. So take it with a grain of salt (but please comment if you have corrections or additional details):

I think the C standard chose to give string literals non-const types because there was (and is) so much code that expects to be able to use non-const-qualified char pointers that point to literals. The const qualifier was added (if I'm not mistaken) around ANSI standardization time, long after K&R C had been around to accumulate a ton of existing code. If pointers to string literals could only be assigned to char const* types without a cast, nearly every program in existence would have required changing. Not a good way to get a standard accepted...

I believe C++ made string literals const-qualified mainly so that a string literal would more appropriately match an overload that takes a "char const*" argument. I think there was also a desire to close a perceived hole in the type system, but the hole was largely opened back up by the special case in array-to-pointer conversions.
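To illustrate the overload point, here is a sketch with a hypothetical overload set named describe. It is not taken from the standard; it only shows which overload a literal selects when both a const and a non-const pointer version exist.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set: one function per pointer qualification.
std::string describe(const char*) { return "const char*"; }
std::string describe(char*)       { return "char*"; }

// A string literal is an array of const char, so it picks the const overload.
std::string describe_literal() { return describe("literal"); }

// A locally declared char array is modifiable, so it picks the non-const overload.
std::string describe_buffer() {
    char buffer[] = "buffer";
    return describe(buffer);
}
```

If literals had a non-const type as in C, the call in describe_literal would presumably go to the char* overload instead, which is the mismatch the paragraph above speculates the change was meant to avoid.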

Annex D of the standard indicates that the "implicit conversion from const to non-const qualification for string literals (4.2) is deprecated", but I think so much code would still break that it'll be a long time before compiler implementers or the standards committee are willing to actually pull the plug (unless some other clever technique can be devised - but then the hole would be back, wouldn't it?).
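Code that wants to stay clear of the deprecated conversion has two straightforward options, sketched below in C++: keep a const pointer to the literal, or initialize a char array from it to get a private, writable copy. The names capitalize and writable_copy_demo are hypothetical. As far as I know, C++11 later removed the conversion from the standard, although compilers commonly still accept it with a warning.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical helper that needs a modifiable buffer: uppercases the first char.
void capitalize(char* s) {
    if (s[0] != '\0') {
        s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
    }
}

// Returns true when the copy was changed and the literal was left untouched.
bool writable_copy_demo() {
    const char* view = "hello";  // read-only view of the literal, no conversion needed
    char copy[] = "hello";       // array initialized from the literal: a separate copy
    capitalize(copy);            // fine: modifies the copy, not the literal
    return std::strcmp(copy, "Hello") == 0 && std::strcmp(view, "hello") == 0;
}
```

Passing view to capitalize would not compile without a cast, and casting the const away and then writing through the pointer would be the undefined behavior described at the top of this answer.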