Create a new data.table column based on other columns

Updated: 2023-12-01 19:44:40

I would use two update joins:

library(data.table)
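# aggregate coordinates: mean longitude/latitude per region and subregion
cols <- c("long", "lat")
# setDT() ensures coord_data is a data.table (needed if it is a plain data.frame)
agg_coord <- setDT(coord_data)[, lapply(.SD, mean), .SDcols = cols, by = .(region, subregion)]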
# coerce to data.table by reference
setDT(dt)[
  # 1st update join to append region/state.name
  .(state = state.abb, state.name = tolower(state.name)), 
  on = "state", region := state.name][
    # append subregion
    , subregion := tolower(county)][
      # 2nd update join to append coordinates; the i. prefix takes long/lat from agg_coord
      agg_coord, on = .(region, subregion), (cols) := .(i.long, i.lat)][
        # remove helper columns
        , c("region", "subregion") := NULL]
# print updated dt
dt[]

    state       county prime_mover       long      lat
 1:    AZ     Maricopa          GT -111.88668 33.58126
 2:    AZ     Maricopa          GT -111.88668 33.58126
 3:    CA  Los Angeles          CT -118.29410 34.06683
 4:    CA       Orange          CT -117.73632 33.69611
 5:    CA  Los Angeles          CT -118.29410 34.06683
 6:    CT    Fairfield          CT  -73.35118 41.29633
 7:    FL Hillsborough          GT  -82.47527 27.87826
 8:    IN       Morgan          CT  -86.49791 39.52721
 9:    MA   Barnstable          GT  -70.21598 41.79520
10:    MA    Nantucket          GT  -70.05841 41.29880
11:    MA        Essex          GT  -70.98384 42.64042
12:    MN       Dakota          GT  -93.04962 44.70344
13:    NJ     Cape May          CT  -74.80790 39.15476
14:    NJ        Salem          GT  -75.36532 39.58720
15:    NJ    Middlesex          CT  -74.42345 40.45429
16:    NY        Kings          GT  -73.95052 40.64792
17:    NC     Buncombe          CT  -82.50883 35.62002
18:    SC     Anderson          CT  -82.61956 34.57094
19:    TN       Shelby          CT  -89.99297 35.22379
20:    TX      Tarrant          CT  -97.29396 32.79856
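
For readers who have not met the update join idiom before, here is a minimal sketch on toy data. The tables `x` and `lookup` and the columns `key_col` and `value` are hypothetical names chosen for illustration; they are not part of the data above.

```r
library(data.table)

# toy tables; all names and values are illustrative only
x      <- data.table(key_col = c("a", "b", "c", "b"))
lookup <- data.table(key_col = c("b", "a"), value = c(20, 10))

# update join: every row of x that matches a row of lookup on key_col
# gets lookup's value assigned by reference; no copy of x is made
# (the i. prefix states explicitly that value comes from lookup)
x[lookup, on = "key_col", value := i.value]

# print the updated table
x[]
```

Rows of `x` without a match in `lookup` (here the row with `key_col == "c"`) keep `NA` in the new column, and the row count and row order of `x` are unchanged. That is why the pattern fits this kind of "append a column from another table" task.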