Python: Remove all rows in a pandas DataFrame that contain a string

Updated: 2023-09-03 18:29:10

I've got a pandas dataframe called data and I want to remove all rows that contain a string in any column. For example, below we see the 'gdp' column has a string at index 3, and 'cap' at index 1.

data =

    y  gdp  cap
0   1    2    5
1   2    3    ab
2   8    7    2
3   3    bc   7
4   6    7    7
5   4    8    3
...

I've been trying to use something like the script below, because I will not know what is contained in exp_list ahead of time. Unfortunately, "data.var_name" raises this error: 'DataFrame' object has no attribute 'var_name'. I also don't know what the strings will be ahead of time, so is there any way to generalize that as well?

exp_list = ['gdp', 'cap']

for var_name in exp_list:
    data = data[data.var_name != 'ab']

You can apply a function that tests each row of your DataFrame for the presence of strings. For example, if df is your DataFrame:

 # On Python 3, test against str; basestring exists only on Python 2.
 rows_with_strings = df.apply(
     lambda row: any(isinstance(e, str) for e in row),
     axis=1)

This produces a boolean mask for your DataFrame indicating which rows contain at least one string. You can then select the rows without strings by inverting the mask:

 df_with_no_strings = df[~rows_with_strings]

Example:

 import pandas as pd

 a = [[1, 2], ['a', 2], [3, 4], [7, 'd']]
 df = pd.DataFrame(a, columns=['a', 'b'])


 df 
   a  b
0  1  2
1  a  2
2  3  4
3  7  d

 # str replaces Python 2's basestring
 select = df.apply(lambda r: any(isinstance(e, str) for e in r), axis=1)

 df[~select]

   a  b
0  1  2
2  3  4
