Updated: 2023-02-26 18:12:39
You can sort the values within each row, then groupby on the sorted pair:
import numpy as np
# Sort each row so that (x, y) and (y, x) produce the same key
a = np.sort(df.to_numpy(), axis=1)
df.groupby([a[:, 0], a[:, 1]], as_index=False, sort=False).first()
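For a quick check of the groupby approach, here is a minimal sketch on made-up data. The sample frame and the name deduped are assumptions for illustration. It also swaps as_index=False for reset_index(drop=True), a small variation that keeps the sorted keys out of the result columns whatever the pandas version does with them.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: (a, b) and (b, a) should count as the same pair
df = pd.DataFrame({"c1": ["a", "b", "a", "c"],
                   "c2": ["b", "a", "c", "d"]})

# Row-wise sort: both (a, b) and (b, a) become the key (a, b)
a = np.sort(df.to_numpy(), axis=1)

# Group on the sorted pair and take the first value seen in each group.
# sort=False keeps groups in order of first appearance.
deduped = (df.groupby([a[:, 0], a[:, 1]], sort=False)
             .first()
             .reset_index(drop=True))
print(deduped)
```

Note that first() takes the first non-null value per column, so with missing values it is not quite the same as "the first row of each group".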
Option 2: If you have a lot of (c1, c2) pairs, groupby can be slow. In that case, we can assign the sorted values as new columns and filter with drop_duplicates:
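a = np.sort(df.to_numpy(), axis=1)       # row-wise sort, as in the first option
(df.assign(one=a[:, 0], two=a[:, 1])     # helper columns; the names one and two are arbitrary
   .drop_duplicates(['one', 'two'])      # keep the first row of each unordered pair
   .reindex(df.columns, axis=1)          # drop the helper columns again
)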
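To try Option 2 end to end, something like the following should work. The sample data and the helper name drop_unordered_pairs are made up for illustration, and the sketch assumes the frame has exactly two comparable columns and no existing columns called one or two.

```python
import numpy as np
import pandas as pd

def drop_unordered_pairs(df):
    """Keep the first row for each unordered pair of values.

    Hypothetical helper: assumes df has exactly two comparable columns
    and no columns already named 'one' or 'two'.
    """
    a = np.sort(df.to_numpy(), axis=1)            # (x, y) and (y, x) sort to the same row
    return (df.assign(one=a[:, 0], two=a[:, 1])   # temporary key columns
              .drop_duplicates(["one", "two"])    # first occurrence of each key wins
              .reindex(df.columns, axis=1))       # back to the original columns only

# Hypothetical sample data: row 1 is the reverse of row 0
df = pd.DataFrame({"c1": ["a", "b", "a", "c"],
                   "c2": ["b", "a", "c", "d"]})
result = drop_unordered_pairs(df)
print(result)
```

Unlike the groupby version, the surviving rows keep their original index labels (0, 2 and 3 here), which is handy if you need to map back to the source frame.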