更新时间:2023-12-01 12:02:34
这是另一种方法是相当凌乱,但我认为你想要的:
Here is another approach that is quite messy but I think does what you want:
library(dplyr)
yesterday <- as.Date(Sys.Date()-1)
df2 <- employ.data %>% group_by(Child) %>%
mutate(`Consec. Days with Callout`=cumsum(rev(cumprod(rev((yesterday-DATE)==(which(DATE == yesterday)-row_number()) & Callout==Callout[DATE == yesterday]))))) %>%
filter(DATE == yesterday)
##Source: local data frame [3 x 7]
##Groups: Child [3]
##
## DATE Parent Child avg_child_salary salary Callout Consec. Days with Callout
## <date> <fctr> <fctr> <dbl> <dbl> <fctr> <dbl>
##1 2016-12-04 A ab 500 2000 HIGH 2
##2 2016-12-04 A ac 300 1234 HIGH 2
##3 2016-12-04 B bd 9000 600 LOW 1
注意:
(yesterday-DATE)==(which(DATE == yesterday)-row_number())& Callout == Callout [DATE == yesterday]
计算一个条件,如果$ code> Callout TRUE / code>与昨天
的标注
相同,如果行中的行距离是昨天
与日期的距离相同。这给出了 Cond
列,如下所示:
(yesterday-DATE)==(which(DATE == yesterday)-row_number()) & Callout==Callout[DATE == yesterday]
computes a condition that will be TRUE
for the row if the Callout
is the same as the Callout
for yesterday
and if the distance in rows from the row that is yesterday
is the same as the distance in days for the date. This gives the Cond
column as shown below:
Source: local data frame [8 x 7]
Groups: Child [3]
DATE Parent Child avg_child_salary salary Callout Cond
<date> <fctr> <fctr> <dbl> <dbl> <fctr> <lgl>
1 2016-12-01 A ab 500 1000 HIGH TRUE
2 2016-12-02 A ab 500 100 LOW FALSE
3 2016-12-03 A ab 500 4000 HIGH TRUE
4 2016-12-04 A ab 500 2000 HIGH TRUE
5 2016-12-01 A ac 300 1000 HIGH FALSE
6 2016-12-03 A ac 300 3455 HIGH TRUE
7 2016-12-04 A ac 300 1234 HIGH TRUE
8 2016-12-04 B bd 9000 600 LOW TRUE
鉴于此,我们希望从昨天行的连续
TRUE
的数量倒数c $ c>(按 Child
分组)。为了做到这一点,我们可以使用 rev
来逆转向量,执行一个 cumprod
,它将从 1
到 0
一旦遇到 FALSE
,反向向量再次使用 rev
,最后执行 cumsum
来累积连续的日子。这样做会给出以下的 Consec。标注日期
列解释为以前连续天数与标注
之间的日期昨天
:
Given this we want to count backwards the number of consecutive TRUE
from the row that is yesterday
(grouped by Child
). To do this, we can reverse the vector using rev
, do a cumprod
, which will switch from 1
to 0
as soon as it encounters a FALSE
, reverse the vector back again using rev
, and finally do the cumsum
to accumulate the consecutive days. Doing this gives the following where the Consec. Days with Callout
column is interpreted as the number of previous consecutive days with the same Callout
as yesterday
:
Source: local data frame [8 x 7]
Groups: Child [3]
DATE Parent Child avg_child_salary salary Callout Consec. Days with Callout
<date> <fctr> <fctr> <dbl> <dbl> <fctr> <dbl>
1 2016-12-01 A ab 500 1000 HIGH 0
2 2016-12-02 A ab 500 100 LOW 0
3 2016-12-03 A ab 500 4000 HIGH 1
4 2016-12-04 A ab 500 2000 HIGH 2
5 2016-12-01 A ac 300 1000 HIGH 0
6 2016-12-03 A ac 300 3455 HIGH 1
7 2016-12-04 A ac 300 1234 HIGH 2
8 2016-12-04 B bd 9000 600 LOW 1
最后,执行过滤器
,就像生成最终结果一样。
Finally, do the filter
as you did to generate the final result.