且构网


R: find the distance between two columns of US zip codes

Updated: 2023-02-01 22:46:37

There is a handy R package named "zipcode" which provides a table of zip codes along with their city, state, latitude and longitude. Once you have those coordinates, the "geosphere" package can calculate the distance between the points.

library(zipcode)
library(geosphere)

# Zip codes must be character strings; stored as numbers, leading zeros are dropped and the lookup fails
df <- data.frame("ZIP_START" = c("95051", "94534", "60193", "94591", "94128", "94015", "94553", "10994", "95008"),
       "ZIP_END" = c("98053", "94128", "60666", "73344", "94128", "73344", "94128", "07105", "94128"),
       stringsAsFactors = FALSE)

data("zipcode")

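# For each row, look up both zip codes and compute the distance between them in meters
df$distance_meters <- apply(df, 1, function(x) {
  startindex <- which(x[["ZIP_START"]] == zipcode$zip)
  endindex <- which(x[["ZIP_END"]] == zipcode$zip)
  # distGeo expects each point as c(longitude, latitude)
  distGeo(p1 = c(zipcode[startindex, "longitude"], zipcode[startindex, "latitude"]),
          p2 = c(zipcode[endindex, "longitude"], zipcode[endindex, "latitude"]))
})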

A warning about the column classes of your input data frame: zip codes should be character, not numeric. Otherwise the leading zeros are dropped, which causes errors in the lookup.
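If your zip codes have already been read in as numbers, one way to repair them is to pad them back to five digits. This is a minimal base R sketch; the helper name `pad_zip` is made up for illustration, and it assumes plain five-digit US zip codes (no ZIP+4 suffixes).

```r
# Converting a zip code to a number silently drops the leading zero
as.numeric("07105")   # 7105

# Hypothetical helper: left-pad with zeros to five digits and return character
pad_zip <- function(z) {
  sprintf("%05d", as.integer(z))
}

pad_zip(c(7105, 95051))   # "07105" "95051"
```

The padded values can then be assigned back to the column, for example `df$ZIP_END <- pad_zip(df$ZIP_END)`, before doing the lookup.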

The distance returned by distGeo is in meters; I will leave the conversion to miles to the reader.
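For reference, one possible conversion: an international mile is exactly 1609.344 meters, so dividing by that constant is enough. The function name `meters_to_miles` below is just an illustrative choice, not part of either package.

```r
# Hypothetical helper: convert meters to international miles
meters_to_miles <- function(meters) {
  meters / 1609.344
}

meters_to_miles(1609.344)   # 1
```

Applied to the result above, this would look like `df$distance_miles <- meters_to_miles(df$distance_meters)`.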

Update
The zipcode package appears to have been archived. There is a replacement package, "zipcodeR", which provides the longitude and latitude data along with additional information.
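A rough sketch of how the same lookup might be done with zipcodeR. It assumes both zipcodeR and geosphere are installed, and that `zipcodeR::geocode_zip()` returns a table with `lat` and `lng` columns; please check the package documentation for your installed version. The wrapper `zip_distance_meters` and its `lookup` argument are hypothetical names introduced here.

```r
# Sketch only: the wrapper name and the `lookup` argument are illustrative.
# Assumption: the lookup function returns a table with columns `lat` and `lng`.
zip_distance_meters <- function(zip_start, zip_end,
                                lookup = zipcodeR::geocode_zip) {
  start <- lookup(zip_start)
  end <- lookup(zip_end)
  # distGeo expects each point as c(longitude, latitude) and returns meters
  geosphere::distGeo(c(start$lng[1], start$lat[1]),
                     c(end$lng[1], end$lat[1]))
}

# Example call (zip codes passed as character strings):
# zip_distance_meters("95051", "98053")
```

To fill a whole column, something like `mapply(zip_distance_meters, df$ZIP_START, df$ZIP_END)` should work, although it performs one lookup per row and may be slow on large data frames.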