Updated: 2021-12-05 05:45:57
Use transform to get a Series with the same size as the original DataFrame, bin it with cut, and aggregate with sum:
import pandas as pd

df['DATE_LOCATION'] = pd.to_datetime(df['DATE_LOCATION'], format='%d-%m-%y %H:%M')
df = df.sort_values("DATE_LOCATION")

# start a new group whenever the gap between consecutive rows exceeds one day
s = (df["DATE_LOCATION"].diff().dt.days > 1).cumsum()
# per-row size of its group, aligned with the original DataFrame
count = s.groupby(s).transform('size')
print(count)
0 2
1 2
2 4
3 4
4 4
5 4
6 2
7 2
8 1
9 3
10 3
11 3
Name: DATE_LOCATION, dtype: int32
# bin the run lengths into ranges, then sum Sold per product and range
bins = pd.cut(count, bins=[0, 3, 7, 15, 31], labels=['1-3', '4-7', '8-15', '>=16'])
df = df.groupby(['PRODUCT_ID', bins])['Sold'].sum().reset_index()
print(df)
PRODUCT_ID DATE_LOCATION Sold
0 0E4234 1-3 9
1 0E4234 4-7 7
2 0G2342 1-3 11
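For a self-contained, runnable illustration, here is the same pipeline on a small made-up DataFrame (the PRODUCT_ID, DATE_LOCATION, and Sold values are invented for this sketch; observed=True is passed to groupby so only populated bins appear, matching the output shown above):

```python
import pandas as pd

# hypothetical sample data, invented for illustration
df = pd.DataFrame({
    'PRODUCT_ID': ['A', 'A', 'A', 'B', 'B', 'B'],
    'DATE_LOCATION': ['01-01-21 08:00', '02-01-21 08:00', '06-01-21 08:00',
                      '01-01-21 09:00', '02-01-21 09:00', '03-01-21 09:00'],
    'Sold': [1, 2, 3, 4, 5, 6],
})

df['DATE_LOCATION'] = pd.to_datetime(df['DATE_LOCATION'], format='%d-%m-%y %H:%M')
df = df.sort_values("DATE_LOCATION")

# a gap of more than one day starts a new group id
s = (df["DATE_LOCATION"].diff().dt.days > 1).cumsum()
# run length per row, same length as the sorted DataFrame
count = s.groupby(s).transform('size')

bins = pd.cut(count, bins=[0, 3, 7, 15, 31], labels=['1-3', '4-7', '8-15', '>=16'])
out = df.groupby(['PRODUCT_ID', bins], observed=True)['Sold'].sum().reset_index()
print(out)
```

Here the jump from 03-01 to 06-01 opens a second group, so five rows get count 5 (bin '4-7') and the last row gets count 1 (bin '1-3'); the binned Series keeps the name DATE_LOCATION, which is why that column label shows up in the grouped result.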