且构网

Get values from the nth column of a data frame, where n differs for each row

Updated: 2023-12-01 09:47:52


How do I construct a vector of values taken from the nth column of a data frame, where n is a per-row value given by a vector? Example:

> df <- data.frame(a=c(100, 110, 120, 130, 140),
                   b=c(200, 210, 220, 230, 240),
                   c=c(300, 310, 320, 330, 340))
> df
    a   b   c
1 100 200 300
2 110 210 310
3 120 220 320
4 130 230 330
5 140 240 340
> cl <- c(1, 3, 3, 2, 1)
> some.function(df, cl)

would result in:

[1] 100 310 320 230 140

You can index by a 2-column matrix -- the first column is the row number and the second is the column number.

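# Row i of the index matrix is the pair (i, cl[i]), so each row selects one cell of df.
# seq_along(cl) gives 1..length(cl) and, unlike seq(cl), stays correct when cl has length 1.
df[cbind(seq_along(cl), cl)]
# [1] 100 310 320 230 140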
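
To make the indexing step easier to see, here is a minimal sketch that builds the index matrix explicitly and wraps the lookup in a small helper. The name pick_by_column and the bounds checks are assumptions added for illustration; they are not part of the original answer or of base R.

```r
# Rebuild the example data from the question
df <- data.frame(a = c(100, 110, 120, 130, 140),
                 b = c(200, 210, 220, 230, 240),
                 c = c(300, 310, 320, 330, 340))
cl <- c(1, 3, 3, 2, 1)

# The index matrix holds one (row, column) pair per row of df
idx <- cbind(seq_along(cl), cl)

# pick_by_column is a hypothetical helper name used only for this sketch
pick_by_column <- function(data, cols) {
  # Require exactly one column number per row, each within range
  stopifnot(length(cols) == nrow(data),
            all(cols >= 1 & cols <= ncol(data)))
  data[cbind(seq_len(nrow(data)), cols)]
}

result <- pick_by_column(df, cl)
```

With the question's data, result should hold the same five values as the one-line version above. The checks only guard against a cl vector of the wrong length or with out-of-range column numbers.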

This is a vectorized operation that should be quicker than looping through the rows with something like sapply and grabbing the appropriate value from that row:

# Slightly larger example, with 1000 rows
set.seed(144)
df <- matrix(rnorm(3000), nrow=1000)
cl <- sample(3, 1000, replace=TRUE)
all.equal(df[cbind(seq(cl), cl)], sapply(seq(nrow(df)), function(i) df[i, cl[i]]))
# [1] TRUE
library(microbenchmark)
microbenchmark(df[cbind(seq(cl), cl)], sapply(seq(nrow(df)), function(i) df[i, cl[i]]))
# Unit: microseconds
#                                             expr     min      lq       mean   median
#                           df[cbind(seq(cl), cl)]  23.828  26.335   34.26012  30.0350
#  sapply(seq(nrow(df)), function(i) df[i, cl[i]]) 855.481 922.449 1178.47502 996.3815
#         uq      max neval
#    38.0315  135.894   100
#  1111.3960 3414.374   100
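
One caveat worth checking before relying on this with a data frame: when a data frame is indexed by a matrix, R first coerces it with as.matrix. If the columns have mixed types, the result is therefore a character vector. The sketch below uses a made-up two-row data frame (mixed is a hypothetical name) to show this.

```r
# Hypothetical data frame with one numeric and one character column
mixed <- data.frame(x = c(1, 2), y = c("a", "b"), stringsAsFactors = FALSE)

# Take column 1 from row 1 and column 2 from row 2
picked <- mixed[cbind(1:2, c(1, 2))]

# as.matrix(mixed) is a character matrix, so the extracted values are strings
is.character(picked)
```

If the selected columns are all numeric, as in the question, no such coercion happens and the result stays numeric.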