Updated: 2023-11-18 18:27:58
You can first compute the avg for the whole column, then use lit() to add it to your DataFrame as a constant column; there is no need for window functions:
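from pyspark.sql.functions import lit

# An empty groupBy() aggregates over the whole DataFrame; take(1)[0][0] pulls the scalar mean to the driver.
mean = df.groupBy().avg("dis_price_released").take(1)[0][0]
# lit() wraps the Python value so it can be attached as a constant column.
df.withColumn("test", lit(mean)).show()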
+------------------+----+
|dis_price_released|test|
+------------------+----+
|               0.0| 2.5|
|               4.0| 2.5|
|               4.0| 2.5|
|               4.0| 2.5|
|               1.0| 2.5|
|               4.0| 2.5|
|               4.0| 2.5|
|               0.0| 2.5|
|               4.0| 2.5|
|               0.0| 2.5|
+------------------+----+