Count the occurrences of a substring in a column in R

Updated: 2022-12-10 17:37:56

I would like to count the occurrences of a string in a column, per group. In this case the string is often a substring within a character column.

I have some data e.g.

ID   String              village
1    fd_sec, ht_rm,      A
2    NA, ht_rm           A
3    fd_sec,             B
4    san, ht_rm,         C

The code that I began with is obviously incorrect, but my searching has not turned up how I could use the grep function on a column and group by village:

impacts <- se %>%  group_by(village) %>%
summarise(c_NA = round(sum(sub$en41_1 ==  "NA")),
          c_ht_rm = round(sum(sub$en41_1 ==  "ht_rm")),
          c_san = round(sum(sub$en41_1 ==  "san")),
          c_fd_sec = round(sum(sub$en41_1 ==  "fd_sec")))

Ideally my output would be:

village  fd_sec  NA  ht_rm  san
A        1       1   2 
B        1
C                    1      1

Thank you in advance

You can also use cSplit() from my "splitstackshape" package. Since this package also loads "data.table", you can then just use dcast() to tabulate the result.

Example:

library(splitstackshape)
cSplit(mydf, "String", direction = "long")[, dcast(.SD, village ~ String)]
# Using 'village' as value column. Use 'value.var' to override
# Aggregate function missing, defaulting to 'length'
#    village fd_sec ht_rm san NA
# 1:       A      1     2   0  1
# 2:       B      1     0   0  0
# 3:       C      0     1   1  0
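
If you would rather stay close to the dplyr attempt in the question, a minimal sketch with grepl() inside summarise() might look like the following. The data frame name mydf and the c_* column names are assumptions taken from the examples above, and the "NA" in row 2 is treated as literal text rather than a missing value. Note that grepl() only reports whether a row contains the pattern, so this counts matching rows per village, not repeated occurrences within one row.

```r
library(dplyr)

# Sample data rebuilt from the question; "NA" here is a literal string (assumption)
mydf <- data.frame(
  ID = 1:4,
  String = c("fd_sec, ht_rm,", "NA, ht_rm", "fd_sec,", "san, ht_rm,"),
  village = c("A", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# For each village, count the rows whose String contains each substring.
# fixed = TRUE matches the text literally instead of as a regular expression.
impacts <- mydf %>%
  group_by(village) %>%
  summarise(
    c_fd_sec = sum(grepl("fd_sec", String, fixed = TRUE)),
    c_NA     = sum(grepl("NA", String, fixed = TRUE)),
    c_ht_rm  = sum(grepl("ht_rm", String, fixed = TRUE)),
    c_san    = sum(grepl("san", String, fixed = TRUE))
  )

print(impacts)
```

Because this matches plain substrings, a short pattern such as "san" would also match longer codes that contain it; splitting the column first, as cSplit() does, avoids that.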
