Flagging the first (or last) record in a group using data.table

Updated: 2021-12-28 21:07:09

Here are a couple of solutions using data.table:

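## Option 1 (cleaner solution, added 2016-11-29)
## Joining without on= requires DT to be keyed on its grouping columns, e.g. setkey(DT, x, y)
uDT <- unique(DT)                     # distinct rows of DT, used as the lookup table
DT[, c("first","last"):=0L]           # initialise both flags to 0
DT[uDT, first:=1L, mult="first"]      # flag the first matching row for each row of uDT
DT[uDT, last:=1L, mult="last"]        # flag the last matching row for each row of uDT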


## Option 2 (original answer, retained for posterity)
DT <- cbind(DT, first=0L, last=0L)
DT[DT[unique(DT),,mult="first", which=TRUE], first:=1L]
DT[DT[unique(DT),,mult="last", which=TRUE], last:=1L]

head(DT)
#      x y first last
# [1,] a A     1    1
# [2,] a B     1    1
# [3,] a C     1    0
# [4,] a C     0    1
# [5,] b A     1    1
# [6,] b B     1    1

There's obviously a lot packed into each of those lines. The key construct, though, is the following, which returns the row index of the first record in each group:

DT[unique(DT),,mult="first", which=TRUE]
# [1]  1  2  3  5  6  7 11 13 15
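
To see the construct on its own, here is a minimal sketch on a small, made-up table. The sample data and the names `first_idx` and `last_idx` are illustrative assumptions, not part of the original answer, and the table is assumed to be keyed on `x` and `y` so that the join matches on those columns.

```r
library(data.table)

# Hypothetical sample data (not from the original post): groups are (x, y) pairs
DT <- data.table(x = c("a", "a", "a", "a", "b", "b"),
                 y = c("A", "B", "C", "C", "A", "B"))
setkey(DT, x, y)   # the joins below match on the key columns

uDT <- unique(DT)  # one row per distinct (x, y) pair

# Row numbers in DT of the first and last match for each row of uDT
first_idx <- DT[uDT, mult = "first", which = TRUE]
last_idx  <- DT[uDT, mult = "last",  which = TRUE]

# Use the row numbers to set the flags by reference
DT[, c("first", "last") := 0L]
DT[first_idx, first := 1L]
DT[last_idx,  last  := 1L]

print(DT)
```

Here the duplicated `a C` rows are the only group with more than one record, so they are the only rows where `first` and `last` differ, which mirrors the `head(DT)` output shown earlier.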
