且构网

Python JSON parser that allows duplicate keys

Updated: 2023-01-17 12:01:07

You can use JSONDecoder.object_pairs_hook to customize how JSONDecoder decodes objects. This hook function will be passed a list of (key, value) pairs that you usually do some processing on, and then turn into a dict.

However, since Python dictionaries don't allow duplicate keys (and you simply can't change that), you can return the pairs unchanged from the hook and get a nested list of (key, value) pairs when you decode your JSON:

from json import JSONDecoder


def parse_object_pairs(pairs):
    # Return the (key, value) pairs untouched so duplicate keys survive.
    return pairs


data = """
{"foo": {"baz": 42}, "foo": 7}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output:

[('foo', [('baz', 42)]), ('foo', 7)]

How you use this data structure is up to you. As stated above, Python dictionaries won't allow for duplicate keys, and there's no way around that. How would you even do a lookup based on a key? dct[key] would be ambiguous.

So you can either implement your own logic to handle a lookup the way you expect it to work, or implement some sort of collision avoidance to make keys unique if they're not, and then create a dictionary from your nested list.
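As a minimal sketch of the first option, lookups can be written directly against the nested pair list. The helper names `get_all` and `get_last` are hypothetical, not part of the `json` module, and the sketch assumes the top-level JSON value is an object:

```python
from json import JSONDecoder


def get_all(pairs, key):
    """Return every value stored under `key`, in document order."""
    return [value for k, value in pairs if k == key]


def get_last(pairs, key):
    """Return the last value for `key`, which is what a plain dict would keep."""
    values = get_all(pairs, key)
    if not values:
        raise KeyError(key)
    return values[-1]


# The hook returns the pairs unchanged, so duplicates are preserved.
decoder = JSONDecoder(object_pairs_hook=lambda pairs: pairs)
obj = decoder.decode('{"foo": {"baz": 42}, "foo": 7}')

print(get_all(obj, "foo"))   # [[('baz', 42)], 7]
print(get_last(obj, "foo"))  # 7
```

Each lookup scans the whole list, which is fine for small objects but slower than a dict for large ones.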

Edit: Since you said you would like to modify the duplicate key to make it unique, here's how you'd do that:

from collections import OrderedDict
from json import JSONDecoder


def make_unique(key, dct):
    # Append _1, _2, ... until the key no longer collides with one in dct.
    counter = 0
    unique_key = key

    while unique_key in dct:
        counter += 1
        unique_key = '{}_{}'.format(key, counter)
    return unique_key


def parse_object_pairs(pairs):
    # Build an ordered dict, renaming any key that has already been seen.
    dct = OrderedDict()
    for key, value in pairs:
        if key in dct:
            key = make_unique(key, dct)
        dct[key] = value

    return dct


data = """
{"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output (the exact OrderedDict repr varies between Python versions):

OrderedDict([('foo', OrderedDict([('baz', 42), ('baz_1', 77)])), ('foo_1', 7), ('foo_2', 23)])

The make_unique function is responsible for returning a collision-free key. In this example it just suffixes the key with _n where n is an incremental counter - just adapt it to your needs.
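One possible adaptation, sketched here as an assumption about what you might need rather than as part of the answer above, is to skip renaming and instead gather all values of a repeated key into a list. The hook name `collect_duplicates` is hypothetical:

```python
import json


def collect_duplicates(pairs):
    """Hypothetical hook: keep every value of a repeated key in a list."""
    counts = {}
    result = {}
    for key, value in pairs:
        counts[key] = counts.get(key, 0) + 1
        if counts[key] == 1:
            # First occurrence: store the value as-is.
            result[key] = value
        elif counts[key] == 2:
            # Second occurrence: wrap both values in a list.
            result[key] = [result[key], value]
        else:
            # Later occurrences: extend the existing list.
            result[key].append(value)
    return result


data = '{"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}'
obj = json.loads(data, object_pairs_hook=collect_duplicates)
print(obj)  # {'foo': [{'baz': [42, 77]}, 7, 23]}
```

The trade-off is that a key that appears once with a JSON array as its value can no longer be told apart from a key that was duplicated, so this only suits data where that ambiguity does not matter.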

Because the object_pairs_hook receives the pairs in exactly the order they appear in the JSON document, it's also possible to preserve that order by using an OrderedDict, so I included that as well.
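On Python 3.7 and later, a plain dict also keeps insertion order, so a hook that builds a regular dict preserves document order too. A small sketch, with `ordered_hook` as a hypothetical name:

```python
import json


def ordered_hook(pairs):
    # Pairs arrive in document order; a dict keeps insertion order on 3.7+.
    return dict(pairs)


obj = json.loads('{"b": 1, "a": 2, "c": 3}', object_pairs_hook=ordered_hook)
print(list(obj))  # ['b', 'a', 'c']
```

OrderedDict remains the safer choice if the code also has to run on older interpreters.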