Conditionally removing rows from a data.table in R

Updated: 2022-12-20 11:40:18

In this case it works much the same as with a data.frame:

data <- data[ menuitem != 'coffee' | amount > 0] 
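To see which rows that condition keeps, here is a minimal sketch on made-up data. The column names menuitem and amount come from the question; the values are hypothetical. Only the coffee rows whose amount is not positive are dropped.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the question.
data <- data.table(
  menuitem = c("coffee", "coffee", "tea", "tea"),
  amount   = c(0, 3, 0, 2)
)

# Keep a row when it is not coffee OR its amount is positive,
# i.e. drop only the coffee rows with amount <= 0.
kept <- data[menuitem != "coffee" | amount > 0]
print(kept)
```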

Deleting or adding rows by reference has yet to be implemented. You can find more info in this question.
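As a rough illustration of what "not by reference" means here, the sketch below (hypothetical data again) keeps a second name bound to the original table and compares memory addresses with data.table's address() helper. The filter builds a new table; the original is left untouched.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the question.
data <- data.table(
  menuitem = c("coffee", "coffee", "tea", "tea"),
  amount   = c(0, 3, 0, 2)
)

old <- data                                      # second name for the original table
data <- data[menuitem != "coffee" | amount > 0]  # the filter returns a new table

print(nrow(old))                       # the original still has all its rows
print(nrow(data))                      # the filtered copy has fewer rows
print(address(old) != address(data))   # the two tables are distinct objects
```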

Regarding speed:

1 You can benefit from keys by doing something like:

setkey(data, menuitem)
data <- data[!"coffee"]

which will be faster than data <- data[menuitem != 'coffee']. However, to apply the same filter you asked about in the question you'll need a rolling join (I've finished my lunch break, I can add something later :-)).
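A small sanity check that the keyed not-join and the plain logical filter select the same rows can be sketched as follows. The data are hypothetical, and this only covers the menuitem part of the condition, not the amount part.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the question.
data <- data.table(
  menuitem = c("tea", "coffee", "juice", "coffee"),
  amount   = c(1, 0, 2, 5)
)

setkey(data, menuitem)                  # sorts by menuitem and marks it as the key

by_key  <- data[!"coffee"]              # not-join on the key
by_scan <- data[menuitem != "coffee"]   # ordinary logical filter

# Compare the two results column by column.
print(identical(by_key$menuitem, by_scan$menuitem))
print(identical(by_key$amount, by_scan$amount))
```

Note that setkey reorders the table by the key column, so row order after keying may differ from the order the data were created in.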

2 Even without a key, data.table is much faster for relatively big tables (the speed is similar for a handful of rows):

library(data.table)
library(microbenchmark)
dt <- data.table(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
df <- data.frame(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
microbenchmark(dt[id == "a"], df[df$id == "a", ])
Unit: milliseconds
               expr       min        lq    median        uq       max neval
      dt[id == "a"]  24.42193  25.74296  26.00996  26.35778  27.36355   100
 df[df$id == "a", ] 138.17500 146.46729 147.38646 149.06766 154.10051   100