Updated: 2023-09-11 22:36:40
There's no such thing as an "ASCII string" or a "UTF-8 string" in Java. By the time you've got a String object, it's just a sequence of UTF-16 code units. There's no record of whether it was originally produced by decoding a byte array as ASCII or as UTF-8.
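A minimal sketch of that point, assuming the bytes happen to be valid in both encodings. The class and method names here are made up for illustration; only `String` and `StandardCharsets` come from the JDK.

```java
import java.nio.charset.StandardCharsets;

class DecodedStringDemo {
    // Interpret the bytes as US-ASCII.
    static String decodeAscii(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Interpret the same bytes as UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x68, 0x65, 0x6C, 0x6C, 0x6F}; // the text "hello"

        String fromAscii = decodeAscii(bytes);
        String fromUtf8 = decodeUtf8(bytes);

        // Both results are ordinary Strings holding the same UTF-16 code units;
        // neither one remembers which charset was used to decode it.
        System.out.println(fromAscii.equals(fromUtf8)); // true
        System.out.println(fromAscii.length());         // 5
    }
}
```

Once the two strings exist, nothing in the `String` API can tell them apart, so any question about "the encoding" has to be asked of the bytes, not of the string.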
Also note that UTF-8 is backward-compatible with ASCII, in that if you've got any valid sequence of bytes representing ASCII-encoded text, that's the same sequence of bytes that would be used to represent the same text in UTF-8.
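The compatibility can be checked in the other direction too, by encoding rather than decoding. This is only a sketch with hypothetical names; it relies on `String.getBytes(Charset)`, which substitutes a replacement byte for characters the charset cannot represent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiUtf8CompatDemo {
    // True when the text encodes to identical bytes in US-ASCII and UTF-8.
    static boolean sameBytesInBothEncodings(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Pure ASCII text: the two encodings produce the same byte sequence.
        System.out.println(sameBytesInBothEncodings("hello")); // true

        // U+00E9 is outside ASCII: UTF-8 needs two bytes for it, while the
        // US-ASCII encoder can only emit a one-byte replacement.
        System.out.println(sameBytesInBothEncodings("\u00e9")); // false
    }
}
```

The compatibility only runs one way: every ASCII byte sequence is valid UTF-8 with the same meaning, but UTF-8 text containing non-ASCII characters has no ASCII equivalent.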