且构网

Inserting rows for missing time values into a data frame

Updated: 2023-02-06 13:40:41

You could try merge together with expand.grid:

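 # Build every group/time combination, then merge it with df;
 # all=TRUE keeps the combinations that have no row in df (data is NA).
 res <- merge(
   expand.grid(group=unique(df$group), time=unique(df$time)),
   df, all=TRUE)
 # Replace the NAs introduced by the merge with 0
 res$data[is.na(res$data)] <- 0
 res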
 #  group time data
 #1     A    1    5
 #2     A    2    6
 #3     A    3    0
 #4     A    4    7
 #5     B    1    8
 #6     B    2    9
 #7     B    3   10
 #8     B    4    0
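
The answer does not show df itself. As a minimal sketch, the input below is reconstructed from the result printed above (the rows with data 0 are taken to be the missing ones), so its exact values and column types are an assumption. stringsAsFactors = FALSE is added here only to keep group as a character column.

```r
# Hypothetical input reconstructed from the printed result:
# group A has no row for time 3, group B has no row for time 4.
df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  time  = c(1, 2, 4, 1, 2, 3),
  data  = c(5, 6, 7, 8, 9, 10)
)

# Every group/time combination that should appear in the result.
grid <- expand.grid(group = unique(df$group), time = unique(df$time),
                    stringsAsFactors = FALSE)

# all = TRUE keeps grid rows that have no match in df; their data is NA.
res <- merge(grid, df, all = TRUE)
res$data[is.na(res$data)] <- 0
res
```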

Or use data.table:

 library(data.table)
 setkey(setDT(df), group, time)[CJ(group=unique(group), time=unique(time))
                     ][is.na(data), data:=0L]
 #    group time data
 #1:     A    1    5
 #2:     A    2    6
 #3:     A    3    0
 #4:     A    4    7
 #5:     B    1    8
 #6:     B    2    9
 #7:     B    3   10
 #8:     B    4    0

Update

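As @thelatemail pointed out in the comments, the approach above fails when a particular time value is missing from every group: unique(df$time) only contains times that occur somewhere in the data, so such a value never enters the grid. Building the grid from the full range of time is more general, as long as time takes consecutive integer values: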

 res <- merge(
   expand.grid(group=unique(df$group),
               time=min(df$time):max(df$time)),
   df, all=TRUE)
 res$data[is.na(res$data)] <- 0
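
To make the difference concrete, here is a small sketch that compares the two grids on an input where time 3 occurs in no group at all. The data and the helper fill_missing are hypothetical and are not part of the original answer.

```r
# Hypothetical input: time 3 does not occur in any group.
df <- data.frame(
  group = c("A", "A", "A", "B", "B"),
  time  = c(1, 2, 4, 1, 4),
  data  = c(5, 6, 7, 8, 9)
)

# Hypothetical helper: complete df against a given vector of time values.
fill_missing <- function(df, times) {
  grid <- expand.grid(group = unique(df$group), time = times,
                      stringsAsFactors = FALSE)
  res <- merge(grid, df, all = TRUE)
  res$data[is.na(res$data)] <- 0
  res
}

# Grid built from the observed times only
by_unique <- fill_missing(df, unique(df$time))
# Grid built from the full range of time
by_range <- fill_missing(df, min(df$time):max(df$time))

nrow(by_unique)  # 6: time 3 is never added
nrow(by_range)   # 8: time 3 is added for both groups
```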

Similarly, in the data.table solution, replace time=unique(time) with time=min(time):max(time).
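
Written out in full, the data.table variant with that replacement might look like the sketch below. The input is hypothetical (time 3 is absent from both groups) and the data.table package is assumed to be installed. The steps are split into separate statements here rather than chained as in the answer.

```r
library(data.table)

# Hypothetical input: time 3 is absent from both groups.
dt <- data.table(
  group = c("A", "A", "A", "B", "B"),
  time  = c(1L, 2L, 4L, 1L, 4L),
  data  = c(5L, 6L, 7L, 8L, 9L)
)

setkey(dt, group, time)

# CJ() builds the cross join of all groups with the full time range.
# Joining it against the keyed table yields NA in data for every
# combination that has no row in dt.
res <- dt[CJ(group = unique(group), time = min(time):max(time))]

# Replace the NAs introduced by the join with 0 (by reference).
res[is.na(data), data := 0L]
res
```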