Unicode replacement character in PHP's htmlspecialchars function

Updated: 2023-02-19 20:00:39

There is only one, universal replacement character: U+FFFD. If you are writing out UTF-8, this code point is simply encoded as UTF-8. If not, you get the corresponding character reference &#xFFFD; instead.
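The two output forms can be sketched outside PHP. The snippet below is only an illustration using Python's built-in codecs, not htmlspecialchars itself, and the sample bytes are made up. Note that Python writes the character reference in decimal (&#65533;), which names the same code point as &#xFFFD;.

```python
# Illustration only: Python's codecs, not PHP's htmlspecialchars internals.
invalid = b"caf\xe9"  # Latin-1 bytes; a trailing \xe9 is not valid UTF-8

# Decoding with replacement turns the invalid byte into U+FFFD.
text = invalid.decode("utf-8", errors="replace")

# Writing out UTF-8: U+FFFD is encoded like any other code point.
utf8_out = text.encode("utf-8")

# Writing out an encoding that cannot represent U+FFFD:
# fall back to a numeric character reference.
ascii_out = text.encode("ascii", errors="xmlcharrefreplace")

print(repr(text))   # 'caf\ufffd'
print(utf8_out)     # b'caf\xef\xbf\xbd'
print(ascii_out)    # b'caf&#65533;'
```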

There is no reversible mapping. By definition, the original byte sequence was invalid, i.e. it does not have a value (valid = has a value), so there is nothing to map back to.
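A small sketch of why the mapping cannot be reversed, again using Python's codecs purely as an illustration (the helper name decode_or_none is hypothetical): different invalid inputs collapse to the same U+FFFD, and a strict decoder assigns them no value at all.

```python
def decode_or_none(raw: bytes):
    """Return the decoded text, or None if the bytes have no value in UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None

first = b"\xff"
second = b"\xfe"

# Neither byte sequence has a value under strict decoding.
strict_first = decode_or_none(first)
strict_second = decode_or_none(second)

# With replacement, both collapse to the same character,
# so the original bytes cannot be recovered from the output.
replaced_first = first.decode("utf-8", errors="replace")
replaced_second = second.decode("utf-8", errors="replace")

print(strict_first, strict_second)        # None None
print(replaced_first == replaced_second)  # True
```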

The bytes that get replaced (they are not really "characters") are those that are not valid in the assumed source encoding. For example, if your source encoding is UTF-16 and you have a lone surrogate, that is "invalid" (though technically any text processor is supposed to abort fatally in that situation). As a better example, if the source encoding is ASCII, then any byte value above 127 is invalid.
