Updated: 2021-12-29 22:00:00
Use DataFrame.set_index with DataFrame.stack to reshape the data and drop missing values, then create indicator columns with get_dummies. Taking max over the first index level returns 1/0 for each ID; finally, convert the column labels to integers:
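import pandas as pd

# dtype=int keeps the indicators as 1/0 (newer pandas defaults to True/False)
df1 = (pd.get_dummies(df.set_index('ID').stack(), dtype=int)
         .groupby(level=0).max()   # .max(level=0) was removed in pandas 2.0
         .rename(columns=int)
         .reset_index())
print(df1)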
   ID  1  2  3  4  5  10  20
0   1  1  1  0  0  1   1   1
1   2  1  1  1  1  0   1   0
print(df)
   ID   0   1    2   3  4  5
0   1  10  20  5.0   1  2  5
1   2   3   4  NaN  10  1  2
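To follow the intermediate steps, the frame printed above can be rebuilt and inspected stage by stage. This is only a sketch: the constructor and the column labels 0 to 5 are inferred from the printout, not taken from the original post, and the explicit dropna() is a defensive assumption in case the installed pandas version keeps NaN rows when stacking.

```python
import numpy as np
import pandas as pd

# Sample data inferred from the printed frame (assumed, not from the post).
df = pd.DataFrame({
    'ID': [1, 2],
    0: [10, 3],
    1: [20, 4],
    2: [5.0, np.nan],
    3: [1, 10],
    4: [2, 1],
    5: [5, 2],
})

# Long format: one row per (ID, original column) pair, missing values dropped.
stacked = df.set_index('ID').stack().dropna()
print(stacked)

# One indicator column per distinct value; every row holds exactly one 1.
dummies = pd.get_dummies(stacked, dtype=int)
print(dummies)
```

At this point each ID still spans several rows, which is why an aggregation over the first index level is needed afterwards.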
With max, the output always contains only 0/1 values (check column 5):
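# dtype=int keeps the indicators as 1/0 (newer pandas defaults to True/False)
df1 = (pd.get_dummies(df.set_index('ID').stack(), dtype=int)
         .groupby(level=0).max()   # .max(level=0) was removed in pandas 2.0
         .rename(columns=int)
         .reset_index())
print(df1)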
   ID  1  2  3  4  5  10  20
0   1  1  1  0  0  1   1   1
1   2  1  1  1  1  0   1   0
With sum, however, the occurrences of each value are counted (check column 5):
df2 = (pd.get_dummies(df.set_index('ID').stack(), dtype=int)
         .groupby(level=0).sum()   # .sum(level=0) was removed in pandas 2.0
         .rename(columns=int)
         .reset_index())
print(df2)
   ID  1  2  3  4  5  10  20
0   1  1  1  0  0  2   1   1
1   2  1  1  1  1  0   1   0
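The difference between the two aggregations can be checked in one self-contained script. This is a minimal sketch under assumptions: the sample frame is reconstructed from the printout in this answer, the names presence and counts are illustrative, and groupby(level=0) is used in place of the level argument so that it also runs on pandas 2.x.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed frame (assumed).
df = pd.DataFrame({
    'ID': [1, 2],
    0: [10, 3],
    1: [20, 4],
    2: [5.0, np.nan],
    3: [1, 10],
    4: [2, 1],
    5: [5, 2],
})

# Indicator rows, one per non-missing cell, grouped by ID.
dummies = pd.get_dummies(df.set_index('ID').stack().dropna(), dtype=int)
grouped = dummies.groupby(level=0)

# max: was the value present at all for this ID?
presence = grouped.max().rename(columns=int).reset_index()
# sum: how many times did the value occur for this ID?
counts = grouped.sum().rename(columns=int).reset_index()

print(presence)
print(counts)

# The two results differ only where a value repeats within one ID.
print(counts.set_index('ID').gt(1))
```

For ID 1 the value 5 appears twice, so it is the only cell where the two frames disagree.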