Apply pandas function to column to create multiple new columns?

Updated: 2022-10-24 18:08:56

How to do this in pandas:

I have a function extract_text_features on a single text column, returning multiple output columns. Specifically, the function returns 6 values.

The function works; however, there doesn't seem to be any proper return type (pandas DataFrame / numpy array / Python list) such that the output gets correctly assigned:

df.ix[: ,10:16] = df.textcol.map(extract_text_features)

So I think I need to drop back to iterating with df.iterrows(), as per this?

UPDATE: Iterating with df.iterrows() is at least 20x slower, so I surrendered and split out the function into six distinct .map(lambda ...) calls.
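For illustration, one way to read the workaround in this update is sketched below. The body of extract_text_features is a made-up stand-in (the real function is not shown in the question), and the six column names are hypothetical.

```python
import pandas as pd

def extract_text_features(text):
    # Hypothetical stand-in: the real function is not shown in the question.
    return (
        len(text),                       # number of characters
        len(text.split()),               # whitespace-separated words
        sum(c.isupper() for c in text),  # uppercase letters
        sum(c.isdigit() for c in text),  # digits
        text.count(" "),                 # spaces
        text.count("."),                 # periods
    )

df = pd.DataFrame({"textcol": ["Hello world.", "pandas 0.11 was old."]})

# Hypothetical column names, one per returned value.
names = ["n_chars", "n_words", "n_upper", "n_digits", "n_spaces", "n_periods"]

# One .map call per output column. The function is re-run for every column,
# which is wasteful, but each assignment is a plain Series.
for i, name in enumerate(names):
    df[name] = df.textcol.map(lambda s, i=i: extract_text_features(s)[i])
```

The `i=i` default argument pins the column position for each lambda; without it, every lambda would see the final value of `i`.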

UPDATE 2: this question was asked back around v0.11.0. Hence much of the question and answers are not too relevant.

Building off of user1827356's answer, you can do the assignment in one pass using df.merge:

df.merge(df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1})), 
    left_index=True, right_index=True)

    textcol  feature1  feature2
0  0.772692  1.772692 -0.227308
1  0.857210  1.857210 -0.142790
2  0.065639  1.065639 -0.934361
3  0.819160  1.819160 -0.180840
4  0.088212  1.088212 -0.911788
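A self-contained version of the snippet above might look like the following sketch. The sample values are copied from the output shown; the variable names features and result are introduced here only for illustration.

```python
import pandas as pd

# Sample values copied from the output shown above.
df = pd.DataFrame({"textcol": [0.772692, 0.857210, 0.065639, 0.819160, 0.088212]})

# When the applied function returns a pd.Series, Series.apply produces a
# DataFrame with one column per key, aligned on the original index.
features = df.textcol.apply(
    lambda s: pd.Series({"feature1": s + 1, "feature2": s - 1})
)

# Merge on the index so each row is matched with its own features.
result = df.merge(features, left_index=True, right_index=True)
print(result)
```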

EDIT: Be aware that this approach is slow and consumes a lot of memory; see https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/
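One commonly used way to reduce that overhead is to return plain tuples and build the new columns in a single DataFrame construction, instead of creating a pd.Series per row. The sketch below is an assumption on my part rather than something taken from the answer; two_features is a hypothetical helper standing in for the lambda above.

```python
import pandas as pd

def two_features(s):
    # Hypothetical helper: returns a plain tuple instead of a pd.Series.
    return s + 1, s - 1

df = pd.DataFrame({"textcol": [0.772692, 0.857210, 0.065639]})

# Build all new columns at once from a list of tuples, reusing df's index
# so the rows stay aligned with the original frame.
features = pd.DataFrame(
    [two_features(s) for s in df.textcol],
    columns=["feature1", "feature2"],
    index=df.index,
)

# join aligns on the index, like the merge call with left_index/right_index.
df = df.join(features)
```

Whether this is faster for a given dataset is worth measuring; the main difference is that no intermediate Series object is created for each row.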
