且构网 - Sharing all things programming and development

How do I flatten a JSON array with Python?

Updated: 2022-06-10 14:57:18

Use pandas and json_normalize:

  • record_path is the parameter that names the path to the list of records to flatten.
  • meta is the parameter for additional keys to carry into every flattened row.
  • json_normalize builds column names by joining all keys on the path to a value, hence the long column names (e.g. RoomRateDetailsList.RoomRateDetails.roomTypeCode).
  • Long column names need to be renamed to shorter versions.
  • A dict comprehension is used to create the rename dict.
  • The following code uses the pathlib library.
  • .open is a method of pathlib.Path.
  • It works with non-Windows paths too.
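
Before the full answer, here is a minimal sketch of how record_path and meta interact, on a tiny made-up document. The keys (response, hotels, hotel, sessionId) are hypothetical and only mimic the shape of the real file; they are not taken from it.

```python
import pandas as pd

# Hypothetical miniature of a nested response; all keys are invented for illustration.
doc = {
    "response": {
        "sessionId": "abc",
        "hotels": {
            "@size": "2",
            "hotel": [
                {"hotelId": 1, "city": "Seattle", "rate": {"total": 185.81}},
                {"hotelId": 2, "city": "Las Vegas", "rate": {"total": 103.95}},
            ],
        },
    }
}

flat = pd.json_normalize(
    doc,
    # path to the list whose items become rows
    record_path=["response", "hotels", "hotel"],
    # values outside that list, repeated on every row
    meta=[["response", "sessionId"], ["response", "hotels", "@size"]],
)

# Nested keys and meta paths show up as dotted column names.
print(list(flat.columns))
print(flat)
```

Each item of the hotel list becomes one row, the nested rate dict is flattened into a dotted column, and each meta path becomes a dotted column whose value is repeated per row. That is where the long names come from.
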
import pandas as pd
import json
from pathlib import Path


# path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json file
with p.open('r', encoding='utf-8') as f:
    data = json.load(f)

# create dataframe
df = pd.json_normalize(data,
                    record_path=['data1', 'HotelListResponse', 'HotelList', 'HotelSummary'],
                    meta=[['data1', 'HotelListResponse', 'customerSessionId'],
                          ['data1', 'HotelListResponse', 'numberOfRoomsRequested'],
                          ['data1', 'HotelListResponse', 'moreResultsAvailable'],
                          ['data1', 'HotelListResponse', 'cacheKey'],
                          ['data1', 'HotelListResponse', 'cacheLocation'],
                          ['data1', 'HotelListResponse', 'HotelList', '@size'],
                          ['data1', 'HotelListResponse', 'HotelList', '@activePropertyCount']])

# rename columns:
rename = {value: value.split('.')[-1].replace('@', '') for value in df.columns}
df.rename(columns=rename, inplace=True)

# dataframe view (printed output, kept as comments so the script stays runnable)
# hotelId       city  highRate  lowRate rateCurrencyCode  roomTypeCode  rateCode             roomDescription  promo  averageBaseRate  averageRate currencyCode  nightlyRateTotal  surchargeTotal   total                     customerSessionId numberOfRoomsRequested moreResultsAvailable                     cacheKey       cacheLocation size activePropertyCount
#  132684    Seattle     159.0    159.0              USD         10351     10351  Standard Room, 1 Queen Bed  False            159.0        159.0          USD             159.0           26.81  185.81  0AB29024-F6D4-3915-0862-DB3FD1904C5A                      1                 True  -705f6d43:15086db3fd1:-4c58  10.178.144.36:7300    2                   2
#  263664  Las Vegas     135.0     94.5              USD        373685   1238953   Standard Room, 1 King Bed   True            135.0         94.5          USD              94.5            9.45  103.95  0AB29024-F6D4-3915-0862-DB3FD1904C5A                      1                 True  -705f6d43:15086db3fd1:-4c58  10.178.144.36:7300    2                   2

# save to JSON
df.to_json('out.json', orient='records')
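
One caveat with the rename step: keeping only the last dotted segment can map two different columns to the same short name, and duplicate column names are a problem for a records-style export. The sketch below is one possible guard, not part of the original answer. The helper build_rename and the sample column names are hypothetical.

```python
from collections import Counter

import pandas as pd


def build_rename(columns):
    """Map each dotted column name to its last segment, unless that would collide."""
    short = {c: c.split(".")[-1].replace("@", "") for c in columns}
    counts = Counter(short.values())
    # Keep the full dotted name wherever several columns share the same short form.
    return {c: (s if counts[s] == 1 else c) for c, s in short.items()}


# Hypothetical columns: two of them end in "total" and would clash if shortened.
demo = pd.DataFrame(
    [[1, 2.0, 3.0, "2"]],
    columns=["hotelId", "RoomRate.total", "Surcharge.total", "HotelList.@size"],
)
demo = demo.rename(columns=build_rename(demo.columns))
print(list(demo.columns))
```

Columns with a unique last segment are shortened as before, while the two clashing columns keep their full dotted names, so the frame still has unique column names.
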

Final JSON output:

[{
        "hotelId": 132684,
        "city": "Seattle",
        "highRate": 159.0,
        "lowRate": 159.0,
        "rateCurrencyCode": "USD",
        "roomTypeCode": 10351,
        "rateCode": 10351,
        "roomDescription": "Standard Room, 1 Queen Bed",
        "promo": false,
        "averageBaseRate": 159.0,
        "averageRate": 159.0,
        "currencyCode": "USD",
        "nightlyRateTotal": 159.0,
        "surchargeTotal": 26.81,
        "total": 185.81,
        "customerSessionId": "0AB29024-F6D4-3915-0862-DB3FD1904C5A",
        "numberOfRoomsRequested": 1,
        "moreResultsAvailable": true,
        "cacheKey": "-705f6d43:15086db3fd1:-4c58",
        "cacheLocation": "10.178.144.36:7300",
        "size": 2,
        "activePropertyCount": 2
    }, {
        "hotelId": 263664,
        "city": "Las Vegas",
        "highRate": 135.0,
        "lowRate": 94.5,
        "rateCurrencyCode": "USD",
        "roomTypeCode": 373685,
        "rateCode": 1238953,
        "roomDescription": "Standard Room, 1 King Bed",
        "promo": true,
        "averageBaseRate": 135.0,
        "averageRate": 94.5,
        "currencyCode": "USD",
        "nightlyRateTotal": 94.5,
        "surchargeTotal": 9.45,
        "total": 103.95,
        "customerSessionId": "0AB29024-F6D4-3915-0862-DB3FD1904C5A",
        "numberOfRoomsRequested": 1,
        "moreResultsAvailable": true,
        "cacheKey": "-705f6d43:15086db3fd1:-4c58",
        "cacheLocation": "10.178.144.36:7300",
        "size": 2,
        "activePropertyCount": 2
    }
]
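
The output above is a list with one object per row because of orient='records'. A small round-trip sketch of that behaviour follows. It uses only three of the columns shown, and the temporary directory is an assumption so that a real out.json is not overwritten.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Three columns borrowed from the table above, just enough to show the shape.
df = pd.DataFrame(
    {
        "hotelId": [132684, 263664],
        "city": ["Seattle", "Las Vegas"],
        "promo": [False, True],
    }
)

# Write to a throwaway location (assumption: any writable temp directory will do).
out = Path(tempfile.mkdtemp()) / "out.json"
df.to_json(out, orient="records")

# Read the file back with the standard json module to inspect its structure.
with out.open("r", encoding="utf-8") as f:
    records = json.load(f)

print(type(records).__name__, len(records))
print(records[0])
```

Reading the file back gives a plain Python list of dicts, one per DataFrame row, which can be passed on to any code that expects flat JSON records.
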