且构网

How to extract fields from nested JSON and save them in a data structure

Updated: 2023-11-23 08:50:52

I've modified your function so that it collects a list of dictionaries. Each dictionary contains only the keys named in the fields list.


import pandas as pd


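def flatten_json(nested_json, fields):
    """Collect the values of the keys listed in `fields` from every nested dict."""
    out = []
    temp = {}

    def flatten(x, name=''):
        nonlocal temp
        if isinstance(x, dict):
            # Start a new record for every dict that is visited.
            temp = {}
            for key in x:
                flatten(x[key], key)
        elif isinstance(x, list):
            for item in x:
                flatten(item)
        elif name in fields:
            temp[name] = x
            # The same record is appended once per matching field, so `out`
            # contains repeats; they are removed later with drop_duplicates().
            out.append(temp)

    flatten(nested_json)
    return out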


json1 = {
    "employees": [
        {
            "first": "Alice", "last_name": "Alast", "zipcode": "12345",
            "role": "dev", "nbr": 1,
            "team": [
                {"first_name": "fn", "last_name": "ln"},
                {"first_name": "fn2", "last_name": "ln2"},
            ],
        },
        {"name": "Bob", "role": "dev", "nbr": 2},
    ],
    "firm": {"last_name": "Lhans", "zipcode": "67890", "location": "CA"},
}

fields = ['first_name', 'last_name', 'zipcode']
result = flatten_json(json1, fields)

The output of flatten_json can then be loaded into a pandas DataFrame:

df = pd.DataFrame(result)
df.drop_duplicates(inplace=True)
print(df)

This gives the following output:

  last_name zipcode first_name
0     Alast   12345        NaN
2        ln     NaN         fn
4       ln2     NaN        fn2
6     Lhans   67890        NaN
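
The index in the output above jumps 0, 2, 4, 6 because flatten_json appends the same dictionary once for every matching field, and drop_duplicates() then discards the repeats. A field that comes after a nested dictionary in its parent would also end up in the child's record, since the shared record is reset on every dictionary. The following is only a sketch of one possible variant, not part of the original answer: it builds each record from a dictionary's own scalar values and appends it once. The names collect_records, walk and data are hypothetical.

```python
def collect_records(nested_json, fields):
    """Return one dict per nested dict that has at least one wanted key."""
    wanted = set(fields)
    out = []

    def walk(node):
        if isinstance(node, dict):
            # Build the record from this dict's own scalar values only.
            record = {
                key: value
                for key, value in node.items()
                if key in wanted and not isinstance(value, (dict, list))
            }
            if record:
                out.append(record)
            # Then descend into nested containers.
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(nested_json)
    return out


# Hypothetical sample: "zipcode" deliberately follows the nested "team" list.
data = {
    "employees": [
        {
            "last_name": "Alast",
            "team": [{"first_name": "fn", "last_name": "ln"}],
            "zipcode": "12345",
        },
    ],
    "firm": {"last_name": "Lhans", "zipcode": "67890"},
}
records = collect_records(data, ["first_name", "last_name", "zipcode"])
print(records)
```

With this variant each record appears once, so the list can be passed to pd.DataFrame without a de-duplication step.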

To get the data back in a JSON-like form, convert the DataFrame back to a list of dictionaries with to_dict():

print(df.to_dict(orient='records'))

Output:

[{'last_name': 'Alast', 'zipcode': '12345', 'first_name': nan},
 {'last_name': 'ln', 'zipcode': nan, 'first_name': 'fn'},
 {'last_name': 'ln2', 'zipcode': nan, 'first_name': 'fn2'},
 {'last_name': 'Lhans', 'zipcode': '67890', 'first_name': nan}]
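
Note that the records above are Python dictionaries, and nan is not a valid JSON value. If an actual JSON string is needed, one option is to replace the missing values with None before serialising. The sketch below assumes the missing values are float NaN, as in the output above; the helper name nan_to_none is hypothetical, and the small records list stands in for the result of to_dict().

```python
import json
import math


def nan_to_none(records):
    """Replace float NaN values with None so json.dumps emits null."""
    return [
        {
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in record.items()
        }
        for record in records
    ]


# Stand-in for df.to_dict(orient='records'); float("nan") plays the role of nan.
records = [
    {"last_name": "Alast", "zipcode": "12345", "first_name": float("nan")},
    {"last_name": "ln", "zipcode": float("nan"), "first_name": "fn"},
]
json_text = json.dumps(nan_to_none(records))
print(json_text)
```

pandas also provides DataFrame.to_json(orient='records'), which may be a shorter route when the DataFrame is still at hand.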