且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

如何在数据框其他列的一列中搜索字符串

更新时间:2022-12-12 09:32:17

grepl 将与一起使用mapply

示例数据框:

title <- c('eggs and bacon','sausage biscuit','pancakes')
description <- c('scrambled eggs and thickcut bacon','homemade biscuit with breakfast pattie', 'stack of sourdough pancakes')
keyword <- c('bacon','sausage','sourdough')
df <- data.frame(title, description, keyword, stringsAsFactors=FALSE)

使用 grepl 搜索匹配项:

df$exists_in_title <- mapply(grepl, pattern=df$keyword, x=df$title)
df$exists_in_description <- mapply(grepl, pattern=df$keyword, x=df$description)

结果:

            title                            description   keyword exists_in_title exists_in_description
1  eggs and bacon      scrambled eggs and thickcut bacon     bacon            TRUE                  TRUE
2 sausage biscuit homemade biscuit with breakfast pattie   sausage            TRUE                 FALSE
3        pancakes            stack of sourdough pancakes sourdough           FALSE                  TRUE



更新I



您也可以使用 dplyr stringr

library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(exists_in_title = grepl(keyword, title),
         exists_in_description = grepl(keyword, description))

library(stringr)
df %>% 
  rowwise() %>% 
  mutate(exists_in_title = str_detect(title, keyword),
         exists_in_description = str_detect(description, keyword))   



更新II



地图也是一个选项,或者使用 tidyverse 中的更多选项,另一个选项可以是 pu rrr stringr

Update II

Mapis also an option, or using more from tidyverse another option could be purrr with stringr:

library(tidyverse)
df %>%
  mutate(exists_in_title = unlist(Map(function(x, y) grepl(x, y), keyword, title))) %>% 
  mutate(exists_in_description = map2_lgl(description, keyword,  str_detect))