且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

在Python中将JIS X 208代码转换为UTF-8

更新时间:2023-11-27 21:47:58

可以通过ISO 2022编解码器进行访问.

These are accessed under ISO 2022 codec.

>>> '亜'.encode('iso2022_jp')
b'\x1b$B0!\x1b(B'

如果我看到这些字节没有被转义序列限制,那么我将不得不知道正在使用哪个版本的JIS X 0208,但是无论如何我现在还是完全在Wikipedia上进行模式匹配.

If I saw those bytes not framed by the escape sequence, I would have to know which version of JIS X 0208 is being used, but I'm entirely pattern matching on Wikipedia at this point anyway.

>>> b = b'\033$B' + bytes.fromhex('3021')
>>> c = b.decode('iso2022_jp')
>>> c
'亜'
>>> urllib.parse.quote(c)
'%E4%BA%9C'

(这是Python 3.)

(This is Python 3.)