且构网

How do I convert a dictionary to a Unicode JSON string?

Updated: 2023-02-14 11:13:23

Requirements

  • Make sure your Python files are encoded in UTF-8. Otherwise, non-ASCII characters will turn into question marks (?). Notepad++ has excellent encoding options for this.

  • Make sure the appropriate fonts are installed. To display Japanese characters, you need a Japanese font.

  • Make sure your IDE supports displaying Unicode characters. Otherwise you may get a UnicodeEncodeError.

      Example:

      UnicodeEncodeError: 'charmap' codec can't encode characters in position 22-23: character maps to <undefined>
      

      PyScripter works for me. It is included with "Portable Python" at http://portablepython.com/wiki/PortablePython3.2.1.1

  • Make sure you are using Python 3+, since it offers better Unicode support.
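The question marks and the UnicodeEncodeError mentioned in the requirements both come from encoding text with a codec that lacks the characters involved. A minimal sketch, assuming cp1252 as a stand-in for a typical Windows "charmap" console codec (the actual codec depends on your environment):

```python
text = "日本"

# Strict encoding fails, much like printing to a console that cannot represent the text.
try:
    text.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError as error:
    strict_failed = True
    print("UnicodeEncodeError:", error.reason)

# With errors="replace", each unencodable character silently becomes a question mark.
replaced = text.encode("cp1252", errors="replace")
print(replaced)

# UTF-8 can represent every Unicode character, so encoding and decoding round-trips.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)
```

If the last line prints True but your console still shows garbage, the problem is likely the display side (console codec or font) rather than the data.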

      json.dumps() escapes non-ASCII characters by default.

      Read the update at the bottom. Or...

      Replace each escaped character with the Unicode character it represents.

      I wrote a simple lambda function, getStringWithDecodedUnicode, that does just that.

      import re
      # Replace every \uXXXX escape with the character it encodes (the raw string avoids invalid-escape warnings).
      getStringWithDecodedUnicode = lambda s: re.sub(r'\\u([\da-f]{4})', lambda m: chr(int(m.group(1), 16)), s)
      

      Here is getStringWithDecodedUnicode as a regular function.

      import re

      def getStringWithDecodedUnicode(value):
          # Matches a literal backslash-u followed by four lowercase hex digits.
          findUnicodeRE = re.compile(r'\\u([\da-f]{4})')

          def getParsedUnicode(match):
              return chr(int(match.group(1), 16))

          return findUnicodeRE.sub(getParsedUnicode, str(value))
      

      Example

      testJSONWithUnicode.py (using PyScripter as the IDE)

      import re
      import json
      getStringWithDecodedUnicode = lambda s: re.sub(r'\\u([\da-f]{4})', lambda m: chr(int(m.group(1), 16)), s)
      
      data = {"Japan":"日本"}
      jsonString = json.dumps( data )
      print( "json.dumps({0}) = {1}".format( data, jsonString ) )
      jsonString = getStringWithDecodedUnicode( jsonString )
      print( "Decoded Unicode: %s" % jsonString )
      

      Output

      json.dumps({'Japan': '日本'}) = {"Japan": "\u65e5\u672c"}
      Decoded Unicode: {"Japan": "日本"}
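One limitation worth knowing: json.dumps() writes characters outside the Basic Multilingual Plane (emoji, for example) as two \uXXXX escapes, a surrogate pair, so replacing each escape independently leaves two lone surrogates instead of one character. A possible workaround, sketched here with a hypothetical helper named decodeWithSurrogatePairs (not part of the original answer), is to round-trip the result through UTF-16 with the surrogatepass error handler, which joins the two halves:

```python
import json
import re

def decodeWithSurrogatePairs(value):
    # Hypothetical helper: same regex substitution as getStringWithDecodedUnicode...
    decoded = re.sub(r'\\u([\da-f]{4})', lambda m: chr(int(m.group(1), 16)), value)
    # ...followed by a UTF-16 round trip that merges surrogate halves into one character.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")

# U+1F600 lies outside the Basic Multilingual Plane, so it is escaped as a pair.
escaped = json.dumps({"face": "\U0001F600"})
fixed = decodeWithSurrogatePairs(escaped)

print(escaped)
print(json.loads(fixed) == json.loads(escaped))
```

Both strings parse to the same dictionary; the difference is only in how the text is written out.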
      

      Update

      Or... just pass ensure_ascii=False as an option to json.dumps.

      Note: You need to meet the requirements outlined at the beginning, or else this will not work.

      import json
      data = {'navn': 'Åge', 'stilling': 'Lærling'}
      result = json.dumps(data, ensure_ascii=False)
      print(result)  # prints {"navn": "Åge", "stilling": "Lærling"} on Python 3.7+
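When the JSON goes to a file rather than the console, the same option can be passed to json.dump, and the file itself should be opened as UTF-8. A minimal sketch, assuming a temporary directory as the destination (the path is illustrative only):

```python
import json
import os
import tempfile

data = {'navn': 'Åge', 'stilling': 'Lærling'}

# Illustrative destination; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Open the file explicitly as UTF-8 so the result does not depend on the platform default.
with open(path, "w", encoding="utf-8") as handle:
    json.dump(data, handle, ensure_ascii=False)

# Read the file back with the same encoding and confirm nothing was lost.
with open(path, encoding="utf-8") as handle:
    raw_text = handle.read()

restored = json.loads(raw_text)
print(restored == data)
```

Passing encoding="utf-8" explicitly matters mostly on Windows, where the default file encoding is often a legacy code page.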