Updated: 2023-11-23 08:50:52
I've modified your function so that it collects a list of dictionaries. Each dictionary contains only the keys named in the fields list.
import pandas as pd


def flatten_json(nested_json, fields):
    out = []
    temp = {}

    def flatten(x, name=''):
        nonlocal temp
        if isinstance(x, dict):
            # Start a fresh record for every nested dict.
            temp = {}
            for a in x:
                flatten(x[a], a)
        elif isinstance(x, list):
            for a in x:
                flatten(a)
        elif name in fields:
            # Keep only the requested fields. The same dict is appended once
            # per matching field, so `out` contains repeated rows.
            temp[name] = x
            out.append(temp)

    flatten(nested_json)
    return out


json1 = {
    "employees": [
        {"first": "Alice", "last_name": "Alast", "zipcode": "12345", "role": "dev", "nbr": 1,
         "team": [{"first_name": "fn", "last_name": "ln"},
                  {"first_name": "fn2", "last_name": "ln2"}]},
        {"name": "Bob", "role": "dev", "nbr": 2},
    ],
    "firm": {"last_name": "Lhans", "zipcode": "67890", "location": "CA"},
}
fields = ['first_name', 'last_name', 'zipcode']
result = flatten_json(json1, fields)
The output of the above function can then be loaded into a pandas DataFrame:
df = pd.DataFrame(result)
df.drop_duplicates(inplace=True)
print(df)
This gives the following output:
  last_name zipcode first_name
0     Alast   12345        NaN
2        ln     NaN         fn
4       ln2     NaN        fn2
6     Lhans   67890        NaN
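The index in the output above skips values (0, 2, 4, 6) because flatten_json appends the same dictionary once for every matching field, and drop_duplicates() then removes the repeats. One possible variant, sketched here only as an illustration, emits each nested object at most once so no de-duplication is needed. The name flatten_json_once is hypothetical, and the sketch assumes that only scalar values (not nested dicts or lists) should be recorded as fields.

```python
import pandas as pd


def flatten_json_once(nested_json, fields):
    """Return one dict per nested object, keeping only keys listed in `fields`."""
    wanted = set(fields)
    out = []

    def walk(node):
        if isinstance(node, dict):
            # Collect the wanted scalar values of this object in one pass.
            row = {k: v for k, v in node.items()
                   if k in wanted and not isinstance(v, (dict, list))}
            if row:
                out.append(row)
            # Then descend into any nested containers.
            for v in node.values():
                walk(v)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(nested_json)
    return out


sample = {
    "employees": [
        {"first": "Alice", "last_name": "Alast", "zipcode": "12345",
         "team": [{"first_name": "fn", "last_name": "ln"},
                  {"first_name": "fn2", "last_name": "ln2"}]},
        {"name": "Bob", "role": "dev"},
    ],
    "firm": {"last_name": "Lhans", "zipcode": "67890", "location": "CA"},
}
rows = flatten_json_once(sample, ["first_name", "last_name", "zipcode"])
df_once = pd.DataFrame(rows)
print(df_once)
```

With this variant the DataFrame index runs from 0 to 3 without gaps. The same effect can be had with the original function by chaining reset_index(drop=True) after drop_duplicates().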
Now, to get the data back in a JSON-like format, you can convert the DataFrame back to a list of dicts using the to_dict() function:
print(df.to_dict(orient='records'))
Output:
[{'first_name': nan, 'last_name': 'Alast', 'zipcode': '12345'},
{'first_name': 'fn', 'last_name': 'ln', 'zipcode': nan},
{'first_name': 'fn2', 'last_name': 'ln2', 'zipcode': nan},
{'first_name': nan, 'last_name': 'Lhans', 'zipcode': '67890'}]
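Note that the records above still contain nan for missing fields, which is not a valid JSON value. If an actual JSON string is needed, one option is DataFrame.to_json, which writes missing values as null. The following is a minimal sketch; the sample rows are typed in by hand to mirror the records shown above rather than taken from the function's output.

```python
import json

import pandas as pd

# Hand-written rows mirroring the records above; missing keys become NaN.
records = [
    {"last_name": "Alast", "zipcode": "12345"},
    {"first_name": "fn", "last_name": "ln"},
    {"first_name": "fn2", "last_name": "ln2"},
    {"last_name": "Lhans", "zipcode": "67890"},
]
df = pd.DataFrame(records)

# to_json serializes NaN as JSON null.
json_text = df.to_json(orient="records")

# Parsing the string back gives plain dicts with None in place of nan.
clean_records = json.loads(json_text)
print(clean_records)
```

Round-tripping through json.loads is only needed if Python objects are wanted; json_text on its own is already a valid JSON document.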