且构网

Using polyfit on a pandas DataFrame and adding the results to new columns

Updated: 2023-11-30 22:37:16

You can group by Id and apply the fit within each group. First, set Id as the index: the per-group result is also indexed by Id, so assigning it back aligns on the index and no merge is needed later.

import pandas as pd
import numpy as np

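# Make Id the index so the per-group result aligns with the rows on assignment
df = df.set_index('Id')
# Fit a degree-1 polynomial per Id; each group yields an array [slope, intercept]
df['fit'] = df.groupby('Id').apply(lambda x: np.polyfit(x.x, x.y, 1))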

df is now:

          x         y                                           fit
Id                                                                 
1   0.79978  0.018255  [0.0067691538557680215, 0.01284116612923385]
1   1.19983  0.020963  [0.0067691538557680215, 0.01284116612923385]
2   2.39998  0.029006   [0.00999574968122608, 0.005016400680051043]
2   2.79995  0.033004   [0.00999574968122608, 0.005016400680051043]
3   1.79965  0.021489  [0.006761823817618233, 0.009320083766623343]
3   2.19969  0.024194  [0.006761823817618233, 0.009320083766623343]
...
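
For readers who want to reproduce this step, here is a minimal self-contained sketch. It assumes a DataFrame with columns Id, x and y; the sample rows are copied from the first three groups of the output above, and the intermediate name coeffs is introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the output shown above (first three groups only).
df = pd.DataFrame({
    'Id': [1, 1, 2, 2, 3, 3],
    'x': [0.79978, 1.19983, 2.39998, 2.79995, 1.79965, 2.19969],
    'y': [0.018255, 0.020963, 0.029006, 0.033004, 0.021489, 0.024194],
})

df = df.set_index('Id')

# One [slope, intercept] array per Id; the result is a Series indexed by Id.
coeffs = df.groupby('Id').apply(lambda g: np.polyfit(g['x'], g['y'], 1))

# Assignment aligns on the index, so every row of a group receives
# that group's coefficient array.
df['fit'] = coeffs
```

Because each sample group has exactly two points, the fitted line passes through both of them, which makes the coefficients easy to check by hand.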

If you want a separate column for each coefficient, you can apply pd.Series:

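# Expand each [slope, intercept] array into two columns
df[['slope', 'intercept']] = df.fit.apply(pd.Series)
# Drop the intermediate array column and restore Id as a regular column
df = df.drop(columns='fit').reset_index()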

df is now:

   Id        x         y     slope  intercept
0   1  0.79978  0.018255  0.006769   0.012841
1   1  1.19983  0.020963  0.006769   0.012841
2   2  2.39998  0.029006  0.009996   0.005016
3   2  2.79995  0.033004  0.009996   0.005016
4   3  1.79965  0.021489  0.006762   0.009320
5   3  2.19969  0.024194  0.006762   0.009320
6   4  1.19981  0.019338  0.007155   0.010753
7   4  1.59981  0.022200  0.007155   0.010753
8   5  1.79971  0.025629  0.007629   0.011898
9   5  2.19974  0.028681  0.007629   0.011898
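
As a possible variation (not part of the original answer), the intermediate fit column can be skipped by returning a named pd.Series from the per-group function and joining the result back on Id. This is only a sketch under the same assumptions about the column names; fit_line and coeffs are hypothetical names chosen for illustration, and the sample rows are the first three groups from the output above.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the output shown above (first three groups only).
df = pd.DataFrame({
    'Id': [1, 1, 2, 2, 3, 3],
    'x': [0.79978, 1.19983, 2.39998, 2.79995, 1.79965, 2.19969],
    'y': [0.018255, 0.020963, 0.029006, 0.033004, 0.021489, 0.024194],
})


def fit_line(g):
    """Hypothetical helper: degree-1 fit for one group, returned as a labeled Series."""
    slope, intercept = np.polyfit(g['x'], g['y'], 1)
    return pd.Series({'slope': slope, 'intercept': intercept})


# One row per Id with columns slope and intercept.
coeffs = df.groupby('Id')[['x', 'y']].apply(fit_line)

# Broadcast the per-group coefficients back onto the original rows.
df = df.join(coeffs, on='Id')
```

Here Id stays a regular column throughout, so no set_index or reset_index round trip is needed.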