Updated: 2023-11-18 23:09:04
Setitem syntax now works in dask.dataframe
df['z'] = df.x + df.y
You are correct that setitem syntax was not supported in dask.dataframe:
df['c'] = ... # mutation not supported
As you suggest, you should instead use .assign(...):
df = df.assign(c=df.a + df.b)
In your example you have an unnecessary call to .compute(). Generally you want to call compute only at the very end, once you have your final result.
As before, dask.dataframe does not support changing rows in place. In-place operations are difficult to reason about in parallel code. At the moment dask.dataframe has no nice alternative operation in this case. I've raised issue #653 for conversation on this topic.