Splitting a Pandas DataFrame into many chunks

Updated: 2023-02-27 09:10:41

You can build a grouping key from the observation column by taking the cumsum() of a diff() comparison. The expression df.observation.diff() != 0 is True on every row whose value differs from the previous row, so its cumsum() increases by one each time a new run of values starts, which gives every consecutive run its own group id. You can then either chain standard analysis after groupby(), as in df.groupby((df.observation.diff() != 0).cumsum())...(other chained analysis here), or split the frame into smaller data frames with a list comprehension:

lst = [g for _, g in df.groupby((df.observation.diff() != 0).cumsum())]

lst[0]
# observation
#d1         1
#d2         1

lst[1]
# observation
#d3        -1
#d4        -1
#d5        -1
#d6        -1
...
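
To see how the group id is formed, here is a minimal self-contained sketch. The frame from the original question is not shown on this page, so the seven rows below are made-up values, and the names changed, group_id and chunks are only for illustration.

```python
import pandas as pd

# Hypothetical data: invented values standing in for the question's frame.
df = pd.DataFrame(
    {"observation": [1, 1, -1, -1, -1, 1, 1]},
    index=[f"d{i}" for i in range(1, 8)],
)

# True wherever the value differs from the previous row. The first row's
# diff is NaN, and NaN != 0 evaluates to True, so it opens the first group.
changed = df["observation"].diff() != 0

# The running count of True values yields one id per consecutive run.
group_id = changed.cumsum()

# One smaller data frame per run, in order of appearance.
chunks = [g for _, g in df.groupby(group_id)]

print(group_id.tolist())
print([list(c.index) for c in chunks])
```

With this made-up data the ids are 1, 1, 2, 2, 2, 3, 3, so the frame splits into three chunks: d1 to d2, d3 to d5, and d6 to d7.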

The index of each chunk:

[i.index for i in lst]

#[Index(['d1', 'd2'], dtype='object'),
# Index(['d3', 'd4', 'd5', 'd6'], dtype='object'),
# Index(['d7', 'd8', 'd9', 'd10'], dtype='object'),
# Index(['d11', 'd12', 'd13', 'd14', 'd15'], dtype='object'),
# Index(['d16', 'd17', 'd18', 'd19', 'd20'], dtype='object')]
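
If you do not need the separate frames, the same key can feed an aggregation directly, which is the "chained analysis" route mentioned above. The sketch below again uses made-up data (an assumption, not the question's frame) and summarizes each run by its value and length; the name runs is hypothetical.

```python
import pandas as pd

# Hypothetical data: invented values standing in for the question's frame.
df = pd.DataFrame(
    {"observation": [1, 1, -1, -1, -1, 1, 1]},
    index=[f"d{i}" for i in range(1, 8)],
)

# Same grouping key as before: a new id each time the value changes.
group_id = (df["observation"].diff() != 0).cumsum()

# For each run, take its first value and count its rows.
runs = df.groupby(group_id)["observation"].agg(["first", "size"])

print(runs)
```

Here runs has one row per consecutive run, with a first column holding the run's value and a size column holding its length.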
