且构网

Sharing the ins and outs of software development...

Removing low-count values from a pandas DataFrame column based on a condition

Updated: 2023-11-18 23:22:40

You can subset the result of value_counts with your condition and take the index of the resulting Series, which holds the values that occur often enough. With isin you can then check which entries of the original data are among those values, and pass the resulting boolean values back to the original DataFrame as a row mask:

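# Count how often each value occurs in column 'a'
s = df['a'].value_counts()
# Keep only rows whose value occurs at least twice (df has the single column 'a')
df[df.isin(s.index[s >= 2]).values]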

How it works:

In [133]: s.index[s >= 2]
Out[133]: Int64Index([0, 2], dtype='int64')


In [134]: df.isin(s.index[s >= 2]).values
Out[134]:
array([[ True],
       [False],
       [ True],
       [ True],
       [ True],
       [ True]], dtype=bool)


In [135]: df[df.isin(s.index[s >= 2]).values]
Out[135]:
   a
0  0
2  0
3  0
4  2
5  2
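
The steps above can be put together into a self-contained sketch. The sample data is an assumption: the session only shows the surviving rows (values 0 and 2), so the value 1 at index 1 is a guess for the row that gets dropped. The helper name drop_low_counts and its parameters are hypothetical. This variant builds the mask from the column itself (df['a'].isin(...)) rather than from the whole frame, which yields a boolean Series, so .values is not needed and the frame may contain other columns.

```python
import pandas as pd

# Assumed sample data: the value 1 at index 1 is a guess; the original
# session only shows the rows that survive the filter.
df = pd.DataFrame({"a": [0, 1, 0, 0, 2, 2]})


def drop_low_counts(frame, column, min_count):
    """Return the rows of `frame` whose `column` value occurs at least `min_count` times.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Occurrences of each distinct value in the column
    counts = frame[column].value_counts()
    # Values that meet the threshold (the index of the filtered counts)
    frequent = counts.index[counts >= min_count]
    # Boolean Series mask, aligned with the rows of `frame`
    return frame[frame[column].isin(frequent)]


result = drop_low_counts(df, "a", 2)
print(result)
```

With the assumed data, this should keep the rows at index 0, 2, 3, 4 and 5, matching the frame shown in Out[135].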