Updated: 2022-06-19 09:24:18
Use groupby.first; it takes the first non-NA value in each group, column by column:
df.groupby('hostname')[['period', 'Teff']].first().reset_index()
# hostname period Teff
#0 Cnc 44.3787 5234
#1 Peg 4.2293 5773
#2 Vir 38.0210 5577
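The question's original DataFrame is not shown here, so the following is only a minimal sketch with made-up rows (the values and the placement of the NaN gaps are assumptions) that reproduces the behaviour above:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each host has several rows with gaps (NaN).
df = pd.DataFrame({
    'hostname': ['Cnc', 'Cnc', 'Peg', 'Peg', 'Vir', 'Vir'],
    'period':   [np.nan, 44.3787, 4.2293, np.nan, 38.0210, np.nan],
    'Teff':     [5234.0, np.nan, np.nan, 5773.0, 5577.0, np.nan],
})

# first() skips NA values column by column, so the values in one
# output row may come from different input rows of the same group.
result = df.groupby('hostname')[['period', 'Teff']].first().reset_index()
print(result)
```

Note that for Cnc the period comes from the second row and Teff from the first, which is usually what you want when collapsing partially filled duplicates.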
Or do this manually with a custom aggregation function:
df.groupby('hostname')[['period', 'Teff']].agg(lambda x: x.dropna().iat[0]).reset_index()
This requires that each group have at least one non-NA value; otherwise dropna() leaves an empty Series and iat[0] raises an IndexError.
Write your own function to handle that edge case:
import numpy as np

def first_(g):
    # Drop NA values, then take the first one left; NaN if the group is all-NA
    non_na = g.dropna()
    return non_na.iat[0] if len(non_na) > 0 else np.nan

df.groupby('hostname')[['period', 'Teff']].agg(first_).reset_index()
# hostname period Teff
#0 Cnc 44.3787 5234
#1 Peg 4.2293 5773
#2 Vir 38.0210 5577
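To see the edge case concretely, here is a small sketch on made-up data (the rows are an assumption, not the question's data) in which one group has no non-NA period at all. It compares the guarded function with the built-in first, which also yields NaN for an all-NA group:

```python
import numpy as np
import pandas as pd

def first_(g):
    # Same guarded helper as above, repeated so the sketch is self-contained.
    non_na = g.dropna()
    return non_na.iat[0] if len(non_na) > 0 else np.nan

# Hypothetical data: 'Peg' has no non-NA value in 'period'.
df = pd.DataFrame({
    'hostname': ['Cnc', 'Cnc', 'Peg', 'Peg'],
    'period':   [44.3787, np.nan, np.nan, np.nan],
    'Teff':     [np.nan, 5234.0, 5773.0, np.nan],
})

# Indexing an empty Series is what makes the unguarded version fail.
try:
    pd.Series([np.nan]).dropna().iat[0]
    raised = False
except IndexError:
    raised = True

builtin = df.groupby('hostname')[['period', 'Teff']].first().reset_index()
custom = df.groupby('hostname')[['period', 'Teff']].agg(first_).reset_index()
print(custom)
```

Both results hold NaN for Peg's period, so the custom function is mainly useful if you want a different fallback than NaN.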