How do I decode a unicode string read from a file in Python?

Updated: 2023-11-15 16:31:10

It looks like the file was created by writing the string form of a bytes literal to it, something like this:

some_bytes = b'Hello world'
with open('myfile.txt', 'w') as f:
    f.write(str(some_bytes))

This gets around the fact that attempting to write bytes to a file opened in text mode raises an error, but at the cost that the file now contains "b'Hello world'" (note the 'b' inside the quotes).
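If such a file already exists and cannot be regenerated, one possible way to recover the original bytes is to parse the text as a Python literal. The following is a minimal sketch, not part of the original answer: `recover_bytes` is a hypothetical helper name, and the UTF-8 encoding is an assumption about what the bytes contain.

```python
import ast


def recover_bytes(text: str) -> bytes:
    """Parse text such as "b'Hello world'" back into a bytes object."""
    # ast.literal_eval only evaluates Python literals, so it is safer than eval().
    value = ast.literal_eval(text.strip())
    if not isinstance(value, bytes):
        raise ValueError("text is not the repr of a bytes object")
    return value


# This is what the text-mode write shown above leaves in the file.
file_contents = "b'Hello world'"

recovered = recover_bytes(file_contents)
# Assumption: the original bytes were UTF-8; substitute the real encoding.
decoded = recovered.decode("utf-8")
print(decoded)
```

This only repairs files that were already written incorrectly; writing the file correctly in the first place is the better fix.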

The solution is to decode the bytes to str before writing:

some_bytes = b'Hello world'
my_str = some_bytes.decode('utf-8')  # or whatever the encoding of the bytes actually is; 'utf-16' would fail on these 11 bytes
with open('myfile.txt', 'w') as f:
    f.write(my_str)

Alternatively, open the file in binary mode and write the bytes directly:

some_bytes = b'Hello world'
with open('myfile.txt', 'wb') as f:
    f.write(some_bytes)

Note that you will need to provide the correct encoding when opening the file in text mode:

with open('myfile.txt', encoding='utf-16') as f:  # Be sure to use the correct encoding
    text = f.read()
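To illustrate why the encoding has to match, here is a small round-trip sketch. It assumes the file is written as UTF-16 and uses a temporary directory instead of a real `myfile.txt`; both choices are only for illustration.

```python
import os
import tempfile

text = "Hello world"
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")

# Write the text with an explicit encoding rather than the platform default.
with open(path, "w", encoding="utf-16") as f:
    f.write(text)

# Reading with the same encoding returns the original string.
with open(path, encoding="utf-16") as f:
    read_back = f.read()

# Binary mode shows the raw bytes on disk, including the byte order mark.
with open(path, "rb") as f:
    raw = f.read()

# Reading with a mismatched encoding fails, because the byte order mark
# is not valid UTF-8.
wrong_encoding_error = None
try:
    with open(path, encoding="utf-8") as f:
        f.read()
except UnicodeDecodeError as exc:
    wrong_encoding_error = exc

print(read_back)
print(raw)
print(type(wrong_encoding_error).__name__)
```

Passing `encoding` explicitly on both the write and the read avoids depending on the platform's default encoding.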

Consider running Python with the -b or -bb flag, which issues a warning or raises an exception respectively, to detect attempts to stringify bytes.
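The effect of the flag can be checked by running the same one-line snippet in a child interpreter with and without `-bb`. This is only a sketch: the snippet string and variable names are illustrative, and it assumes a Python 3 interpreter that supports the `-bb` option.

```python
import subprocess
import sys

# The problematic operation: calling str() on a bytes object.
snippet = "str(b'Hello world')"

# Without the flag, the conversion succeeds silently.
plain = subprocess.run(
    [sys.executable, "-c", snippet], capture_output=True, text=True
)

# With -bb, the same conversion raises BytesWarning as an exception.
strict = subprocess.run(
    [sys.executable, "-bb", "-c", snippet], capture_output=True, text=True
)

print("plain exit code:", plain.returncode)
print("strict exit code:", strict.returncode)
print(strict.stderr.strip())
```

Running a test suite under `-bb` is one way to catch accidental `str(some_bytes)` calls before they end up in a file.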
