Updated: 2021-09-03 05:07:26
The problem is not rbind, the problem is Reduce. Unfortunately, function calls in R are expensive, particularly when each call creates a new object. In this case, Reduce calls rbind 65999 times, and each call creates a new R object with one more row than the last. Instead, you can call rbind once with 66000 arguments, which is much faster: internally, rbind does the binding in C, allocates the memory just once, and avoids 66000 R-level function calls. Here we compare your Reduce approach with Zheyuan's matrix/unlist approach, and finally with rbind called once via do.call (do.call lets you call a function with all of its arguments supplied as a list):
out1 <- replicate(1000, 1:20, simplify=FALSE) # use 1000 elements for illustrative purposes
library(microbenchmark)
microbenchmark(times=10,
a <- do.call(rbind, out1),
b <- matrix(unlist(out1), ncol=20, byrow=TRUE),
c <- Reduce(rbind, out1)
)
# Unit: microseconds
#                                                expr        min         lq
#                           a <- do.call(rbind, out1)    469.873    479.815
#  b <- matrix(unlist(out1), ncol = 20, byrow = TRUE)    257.263    260.479
#                            c <- Reduce(rbind, out1) 110764.898 113976.376
all.equal(a, b, check.attributes=FALSE)
# [1] TRUE
all.equal(b, c, check.attributes=FALSE)
# [1] TRUE
Zheyuan's matrix/unlist approach is the fastest, but for all practical purposes the do.call(rbind, ...) method performs similarly; both are roughly two orders of magnitude faster than Reduce here.
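To make the difference in call counts concrete, here is a minimal sketch that wraps rbind in a counter. The names counting_rbind, n_calls, and rows are hypothetical and introduced only for this illustration; they are not part of the original benchmark.

```r
# Hypothetical helper (illustration only): wraps rbind and counts its invocations.
n_calls <- 0
counting_rbind <- function(...) {
  n_calls <<- n_calls + 1
  rbind(...)
}

# Same shape of input as the benchmark above: 1000 vectors of length 20.
rows <- replicate(1000, 1:20, simplify = FALSE)

# Reduce folds pairwise, so the wrapper is called once per element after the first.
n_calls <- 0
m_reduce <- Reduce(counting_rbind, rows)
calls_reduce <- n_calls

# do.call passes every list element as an argument to one single call.
n_calls <- 0
m_docall <- do.call(counting_rbind, rows)
calls_docall <- n_calls

c(reduce = calls_reduce, do.call = calls_docall)
```

With 1000 list elements, Reduce should report 999 calls and do.call a single call. Both produce the same 1000 x 20 matrix, apart from attributes such as row names.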