且构网

R data frame: one-hot encoding a column that contains multiple terms

Updated: 2022-12-17 10:00:58

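One option is mtabulate from the qdapTools package, applied after splitting the 'Info' column on ", " (a comma followed by a space). strsplit returns one character vector of terms per row, and mtabulate turns that list into a table of per-row counts with one column per distinct term.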

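library(qdapTools)

# Split each 'Info' string into its terms, tabulate the terms per row,
# and bind the resulting indicator columns to the original data frame.
# Note: the literal string "NULL" in row 3 is treated as a term of its own.
cbind(mydf, mtabulate(strsplit(mydf$Info, ", ")))
#  Age                      Info Target bad fun go good happy joy nice NULL okay sad wild
#1  99            good, bad, sad    Boy   1   0  0    1     0   0    0    0    0   1    0
#2  10          nice, happy, joy   Girl   0   0  0    0     1   1    1    0    0   0    0
#3  40                      NULL    Boy   0   0  0    0     0   0    0    1    0   0    0
#4  15 okay, nice, fun, wild, go    Boy   0   1  1    0     0   0    1    0    1   0    1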
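
If installing qdapTools is not an option, the same split-then-tabulate idea can be sketched in base R. This is an illustrative alternative, not the approach from the answer above: mydf is reconstructed here from the printed output, and the names terms, vocab, onehot and result are made up for the example. It also writes a 0/1 indicator rather than a count, which gives the same table only when no term is repeated within a row.

```r
# Sample data reconstructed from the output shown above (an assumption;
# the original definition of mydf is not part of the answer).
mydf <- data.frame(
  Age    = c(99, 10, 40, 15),
  Info   = c("good, bad, sad", "nice, happy, joy", "NULL",
             "okay, nice, fun, wild, go"),
  Target = c("Boy", "Girl", "Boy", "Boy"),
  stringsAsFactors = FALSE
)

# One character vector of terms per row.
terms <- strsplit(mydf$Info, ", ", fixed = TRUE)

# All distinct terms; these become the indicator columns.
vocab <- sort(unique(unlist(terms)))

# For each row, mark which vocabulary terms are present (1) or absent (0).
# vapply returns one column per row of mydf, so transpose the result.
onehot <- t(vapply(terms,
                   function(x) as.integer(vocab %in% x),
                   integer(length(vocab))))
colnames(onehot) <- vocab

result <- cbind(mydf, onehot)
```

The column order depends on how sort() collates strings in the current locale, so the position of the NULL column may differ from the output above.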