且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

调试功能:为多列创建多个滞后(dplyr)

更新时间:2023-02-02 22:46:35

我们可以使用 shift from data.table 可以为'n'采取多个值

We can use shift from data.table which can take multiple values for 'n'

library(data.table)
setDT(df)[order(time), c("a", "b", "c") := shift(x, 1:3) , id][order(id, time)]

假设我们需要在多个列上执行

Suppose, we need to do this on multiple columns

df$y <- df$x
setDT(df)[order(time), paste0(rep(c("x", "y"), each =3), 
                c("a", "b", "c")) :=shift(.SD, 1:3), id, .SDcols = x:y]






shift 也可以在 dplyr中使用

library(dplyr)
df %>% 
  group_by(id) %>% 
  arrange(id, time) %>% 
  do(data.frame(., setNames(shift(.$x, 1:3), c("a", "b", "c"))))
#    id  time     x     a     b     c
#   <dbl> <int> <int> <int> <int> <int>
#1      1  2000     1    NA    NA    NA
#2      1  2001     2     1    NA    NA
#3      1  2002     3     2     1    NA
#4      1  2003     4     3     2     1
#5      1  2004     5     4     3     2
#6      1  2005     6     5     4     3
#7      1  2006     7     6     5     4
#8      1  2007     8     7     6     5
#9      1  2008     9     8     7     6
#10     1  2009    10     9     8     7
#11     2  2000    10    NA    NA    NA
#12     2  2001    11    10    NA    NA
#13     2  2002    12    11    10    NA
#14     2  2003    13    12    11    10
#15     2  2004    14    13    12    11
#16     2  2005    15    14    13    12
#17     2  2006    16    15    14    13
#18     2  2007    17    16    15    14
#19     2  2008    18    17    16    15
#20     2  2009    19    18    17    16