R: Create a dummy variable based on whether one column's value exists in another column

Updated: 2022-12-10 17:55:24

It looks like column A holds a single string of comma-separated numbers, so %in% is not appropriate here. (%in% would work if, for example, you were checking for B in a vector of several separate strings, or of numbers if A and B were numeric.) If your data frame is structured differently, please let me know (and feel free to edit your question).
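To illustrate why %in% does not work on this layout, here is a minimal sketch. The scalar values A and B below are stand-ins for a single row, not the asker's actual data.

```r
# Stand-in values for one row (assumed for illustration)
A <- "2012,2013,2014"
B <- "2012"

# %in% compares whole elements, so "2012" is never equal to the full string A
B %in% A

# Once A is split into its comma-separated parts, %in% behaves as expected
B %in% strsplit(A, ",", fixed = TRUE)[[1]]
```

The first comparison is FALSE because A is one element; the second is TRUE because strsplit turns A into a vector of three separate strings.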

You could probably accomplish this in several ways. One easy way is to use grepl one row at a time to check whether the value in column B appears in column A.

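library(tidyverse)

df %>%
  rowwise() %>%                      # evaluate grepl() one row at a time
  mutate(dummy = +grepl(B, A)) %>%   # unary + converts TRUE/FALSE to 1/0
  ungroup()                          # drop the row-wise grouping afterwards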

Output

# A tibble: 5 x 3
  A              B     dummy
  <fct>          <fct> <int>
1 2012,2013,2014 2011      0
2 2012,2013,2014 2012      1
3 2012,2013,2014 2013      1
4 2012,2013,2014 2014      1
5 2012,2013,2014 2015      0
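One caveat: grepl treats B as a pattern and matches substrings, so a value such as "201" would also be flagged as present. If exact matching matters, a possible alternative is to split A on commas and compare whole tokens. This is a sketch in base R, assuming the same data layout as in this answer. The names tokens and the anonymous helper function are illustrative only.

```r
# Same example data as in this answer, kept as character columns
df <- data.frame(
  A = rep("2012,2013,2014", 5),
  B = c("2011", "2012", "2013", "2014", "2015"),
  stringsAsFactors = FALSE
)

# Split each A value into its comma-separated tokens (one character vector per row)
tokens <- strsplit(df$A, ",", fixed = TRUE)

# For each row, check whether B equals one of the tokens exactly;
# trimws() guards against stray spaces around the commas
df$dummy <- as.integer(
  mapply(function(b, parts) b %in% trimws(parts), df$B, tokens)
)

df
```

On this data the dummy column agrees with the grepl result; the two approaches differ only when B could be a partial match of a value in A.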

Data

df <- data.frame(
  A = rep("2012,2013,2014", 5),
  B = c("2011", "2012", "2013", "2014", "2015")
)