且构网

How can I create a new column in a data frame based on the permutations of other columns?

Updated: 2023-10-22 14:06:10

You can do the following:

## paste the rows together, creating a character vector
x <- do.call(paste, df)
## match it against itself and apply to 'LETTERS', and assign as new column
df$category <- LETTERS[match(x, x)]
df
#    var1  var2  var3  var4 category
# a  TRUE FALSE  TRUE FALSE        A
# b  TRUE  TRUE  TRUE FALSE        B
# c FALSE  TRUE FALSE  TRUE        C
# d  TRUE FALSE FALSE FALSE        D
# e  TRUE FALSE  TRUE FALSE        A
# f FALSE  TRUE FALSE  TRUE        C
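One thing worth noting: `match(x, x)` returns the position of each pattern's first occurrence, so the letters follow row positions rather than a running group count. With the sample data the two happen to coincide. The sketch below is a possible variant, not part of the original answer: it assumes consecutive labels are wanted and matches against `unique(x)` instead. The reordered data frame `df2` and the names `by_position` and `by_group` are made up for illustration. `LETTERS` has only 26 entries, so either approach yields `NA` beyond that.

```r
# Rebuild the sample data so this block runs on its own.
df <- data.frame(
  var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  var3 = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  var4 = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  row.names = c("a", "b", "c", "d", "e", "f")
)

# Hypothetical reordering: duplicate rows now sit next to each other,
# so first-occurrence positions are no longer consecutive.
df2 <- df[c("a", "e", "b", "c", "f", "d"), ]
x2 <- do.call(paste, df2)

# Labels based on first-occurrence position (the approach shown above).
by_position <- LETTERS[match(x2, x2)]

# Labels based on the order in which distinct patterns appear.
by_group <- LETTERS[match(x2, unique(x2))]

by_position
# [1] "A" "A" "C" "D" "D" "F"
by_group
# [1] "A" "A" "B" "C" "C" "D"
```

Which form is preferable depends on whether gaps in the lettering matter for the downstream use.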

The above code can be written as a one-liner if we use a named list as an environment. This avoids making any new assignments to the global environment.

df$category <- LETTERS[with(list(x = do.call(paste, df)), match(x, x))]
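As a rough illustration of why the one-liner leaves nothing behind, the sketch below compares it with an equivalent written with `local()`, which also evaluates its body in a temporary environment. The `local()` form is an alternative suggested here, not something from the original answer, and the names `row_key`, `via_with`, and `via_local` are invented for the example.

```r
# Rebuild the sample data so this block runs on its own.
df <- data.frame(
  var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  var3 = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  var4 = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  row.names = c("a", "b", "c", "d", "e", "f")
)

# The named list supplies 'row_key' only while the expression is evaluated.
via_with <- LETTERS[with(list(row_key = do.call(paste, df)),
                         match(row_key, row_key))]

# Same idea with local(): 'row_key' lives in a throwaway environment.
via_local <- local({
  row_key <- do.call(paste, df)
  LETTERS[match(row_key, row_key)]
})

identical(via_with, via_local)
# [1] TRUE

# In a fresh session this is FALSE: neither form created 'row_key'
# in the calling environment.
exists("row_key", inherits = FALSE)
```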

Data:

df <- structure(list(var1 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE), 
    var2 = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE), var3 = c(TRUE, 
    TRUE, FALSE, FALSE, TRUE, FALSE), var4 = c(FALSE, FALSE, 
    TRUE, FALSE, FALSE, TRUE)), .Names = c("var1", "var2", "var3", 
"var4"), row.names = c("a", "b", "c", "d", "e", "f"), class = "data.frame")