且构网

How to generate lags of data nested within multiple columns?

Last updated: 2023-02-02 23:02:42

Updated Answer

Based on your comment, I rebuilt a minimal dataset df with 3 time points, 2 regions, and 3 age groups.

set.seed(1234)
time.number = 3
region.number = 2
age.number = 3
total.number = time.number * region.number * age.number
df <-
  data.frame(
    Time = rep(1:time.number, each = region.number * age.number),
    Region = rep(LETTERS[1:region.number], each = age.number),
    Age = rep(seq(1, age.number), region.number),
    No_Persons = round(rnorm(total.number, mean = 10), 0)
  )
df
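
The data.frame() call above depends on R recycling the six-element Region and Age vectors to the 18 rows implied by Time. As a sketch only, the same layout can be written with every repetition spelled out. The name grid is hypothetical and not part of the original answer.

```r
time.number <- 3
region.number <- 2
age.number <- 3

# Same layout as df, but with every repetition written out explicitly
grid <- data.frame(
  Time   = rep(1:time.number, each = region.number * age.number),
  Region = rep(rep(LETTERS[1:region.number], each = age.number),
               times = time.number),
  Age    = rep(seq_len(age.number), times = time.number * region.number)
)

# Each Time/Region/Age combination should appear exactly once
stopifnot(nrow(grid) == time.number * region.number * age.number)
stopifnot(anyDuplicated(grid) == 0)
```

Writing the repetition out makes the code fail loudly if one of the counts changes and the vectors no longer line up.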

The following solution should also work on your real data.

library(data.table)
library(magrittr)
# convert df to a data.table by reference
setDT(df)

# derive the number of age groups and regions from the data itself
age.number <- df[,Age] %>% unique() %>% length()
region.number <- df[,Region] %>% unique() %>% length()

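# V1 holds ages 1..(n-1), V2 holds ages 2..n within each Time/Region group.
# Parentheses are needed: 1:age.number-1 parses as (1:age.number) - 1.
# shift() is data.table's lag; stats::lag() does not shift a plain vector.
df[,.(V1=.SD[1:(age.number-1),No_Persons],
      V2=.SD[2:age.number,No_Persons]),
   by = .(Time,Region)][,Ratio:=V2/shift(V1,region.number)][]

Result:

   Time Region V1 V2    Ratio
 1:    1      A  9 10       NA
 2:    1      A 10 11       NA
 3:    1      B  8 10 1.111111
 4:    1      B 10 11 1.100000
 5:    2      A  9  9 1.125000
 6:    2      A  9  9 0.900000
 7:    2      B  9 10 1.111111
 8:    2      B 10  9 1.000000
 9:    3      A  9 10 1.111111
10:    3      A 10 11 1.100000
11:    3      B 10  9 1.000000
12:    3      B  9  9 0.900000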
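
One detail about the row range used for V1: in R the colon operator binds more tightly than binary minus, so 1:age.number-1 evaluates as (1:age.number) - 1 rather than 1:(age.number - 1). A small base-R illustration, using a made-up vector x:

```r
age.number <- 3

# ':' is evaluated before binary '-', so this is (1:3) - 1
a <- 1:age.number - 1
# the intended range: the first age.number - 1 positions
b <- 1:(age.number - 1)

x <- c(9, 10, 11)

# In base R a zero index is dropped, so both forms select the same elements
stopifnot(identical(x[a], x[b]))
```

Both forms pick the same elements here only because the zero index is dropped. The parenthesised form states the intent directly.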

Previous Answer

I'm not sure whether this is the result you want, but it does produce the correct values.

library(data.table)
# odd rows become V1 and even rows become V2 within each Time/Region group
setDT(df)[,.(V1 = No_Persons[seq(1,.N,2)],
             V2 = No_Persons[seq(2,.N,2)]
            ),
          by = .(Time,Region)][,Ratio:=V2/shift(V1,2)]
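
Both answers lag by a fixed number of rows, so the result depends on row order and on every group having the same size. As a hedged alternative, the sketch below lags within groups by passing by to shift(). It assumes the goal is the previous time point's value for the same Region and Age, which may not match the original question. The table dt and the column Prev are hypothetical names, and the data mirror the example dataset above.

```r
library(data.table)

set.seed(1234)
dt <- data.table(
  Time = rep(1:3, each = 6),
  Region = rep(rep(c("A", "B"), each = 3), times = 3),
  Age = rep(1:3, times = 6),
  No_Persons = round(rnorm(18, mean = 10), 0)
)

# Order by time inside each Region/Age cell so the lag follows time
setorder(dt, Region, Age, Time)

# Prev: No_Persons at the previous time point for the same Region and Age
dt[, Prev := shift(No_Persons, 1), by = .(Region, Age)]

dt[]
```

Because the lag is computed per group, the first time point of every Region/Age cell gets NA, and values never leak from one region or age group into another.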