更新时间:2023-08-28 16:10:58
将drop_duplicates
与subset
一起使用,并在列列表中检查重复项,并在keep='first'
上保留重复项.
Using drop_duplicates
with subset
with list of columns to check for duplicates on and keep='first'
to keep first of duplicates.
如果dataframe
是:
df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
'Column2': ["'bat'", "'flower'", "'bat'"],
'Column3': ["'xyz'", "'abc'", "'lmn'"]})
print(df)
结果:
Column1 Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'
2 'cat' 'bat' 'lmn'
然后:
result_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='first')
print(result_df)
结果:
Column1 Column2 Column3
0 'cat' 'bat' 'xyz'
1 'toy' 'flower' 'abc'