Updated: 2023-02-02 23:02:42
Based on your comment, I rebuilt a minimal dataset df with 3 time points, 2 regions, and 3 age groups.
set.seed(1234)
time.number = 3
region.number = 2
age.number = 3
total.number = time.number * region.number * age.number
df <-
  data.frame(
    Time = rep(1:time.number, each = region.number * age.number),
    Region = rep(LETTERS[1:region.number], each = age.number),
    Age = rep(seq(1, age.number), region.number),
    No_Persons = round(rnorm(total.number, mean = 10), 0)
  )
df
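Region and Age are built shorter than Time, so data.frame() recycles them to fill all 18 rows. A quick check, sketched here in base R only, confirms that every Time/Region/Age combination appears exactly once. The name counts is just for illustration.

```r
# Rebuild the minimal dataset exactly as above
set.seed(1234)
time.number <- 3
region.number <- 2
age.number <- 3
total.number <- time.number * region.number * age.number
df <- data.frame(
  Time = rep(1:time.number, each = region.number * age.number),
  Region = rep(LETTERS[1:region.number], each = age.number),
  Age = rep(seq(1, age.number), region.number),
  No_Persons = round(rnorm(total.number, mean = 10), 0)
)

# Cross-tabulate the three grouping columns;
# a complete grid has exactly one row per cell
counts <- table(df$Time, df$Region, df$Age)
nrow(df)          # 18
all(counts == 1)  # TRUE
```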
The following solution should also apply to your real data.
library(data.table)
library(magrittr)
# set df as data.table
setDT(df)
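# derive the number of age groups and regions from the data itself
age.number <- df[, Age] %>% unique() %>% length()
region.number <- df[, Region] %>% unique() %>% length()
# V1: all ages but the last, V2: all ages but the first, within each Time/Region group
# 1:age.number-1 parses as (1:age.number)-1, so the parentheses are written explicitly
# shift() is data.table's lag; with only these packages loaded, lag() is stats::lag,
# which does not move the values
df[, .(V1 = .SD[1:(age.number - 1), No_Persons],
       V2 = .SD[2:age.number, No_Persons]),
   by = .(Time, Region)][, Radio := V2 / shift(V1, region.number)][]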
Result:
    Time Region V1 V2    Radio
 1:    1      A  9 10       NA
 2:    1      A 10 11       NA
 3:    1      B  8 10 1.111111
 4:    1      B 10 11 1.100000
 5:    2      A  9  9 1.125000
 6:    2      A  9  9 0.900000
 7:    2      B  9 10 1.111111
 8:    2      B 10  9 1.000000
 9:    3      A  9 10 1.111111
10:    3      A 10 11 1.100000
11:    3      B 10  9 1.000000
12:    3      B  9  9 0.900000
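To see where the NA values and the ratios in the result come from, here is a minimal sketch of the lag step on the first four rows only. It uses data.table's shift(), which lags a vector and pads the start with NA. The vectors v1 and v2 are copied by hand from the table above, and the names lagged and ratio are just for illustration.

```r
library(data.table)

# First four V1 / V2 values from the result (Time 1, regions A and B)
v1 <- c(9, 10, 8, 10)
v2 <- c(10, 11, 10, 11)

# shift() moves the values down by n positions and fills the gap with NA
lagged <- shift(v1, n = 2)
lagged
# NA NA 9 10

# Each V2 is divided by the V1 from two rows earlier
ratio <- v2 / lagged
round(ratio, 6)
# NA NA 1.111111 1.1
```

The first two rows have no earlier V1 to divide by, which is why they come out as NA.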
I'm not sure this is exactly the output you want, but it does compute the right values.
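library(data.table)
# V1: odd-numbered rows of each Time/Region group, V2: even-numbered rows
# shift() is data.table's lag and pads the first rows with NA;
# a bare lag() would be stats::lag unless dplyr is loaded
setDT(df)[, .(V1 = No_Persons[seq(1, .N, 2)],
              V2 = No_Persons[seq(2, .N, 2)]),
          by = .(Time, Region)][, Radio := V2 / shift(V1, 2)][]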