Updated: 2023-11-18 21:54:46
You can group by Company and then use the pivot function on the Type column.
Here is a simple example:
import org.apache.spark.sql.functions._
import spark.implicits._  // needed for .toDF on local collections

val df = Seq(
  ("A", "X", "done"),
  ("A", "Y", "done"),
  ("A", "Z", "done"),
  ("C", "X", "done"),
  ("C", "Y", "done"),
  ("B", "Y", "done")
).toDF("Company", "Type", "Status")

// Pivot Type into one column per value; companies missing a Type get "pending"
val result = df.groupBy("Company")
  .pivot("Type")
  .agg(expr("coalesce(first(Status), \"pending\")"))
result.show()
Output:
+-------+-------+----+-------+
|Company| X| Y| Z|
+-------+-------+----+-------+
| B|pending|done|pending|
| C| done|done|pending|
| A| done|done| done|
+-------+-------+----+-------+
You can rename the columns later.
Hope this helps!
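If you do want different column names, one way is `withColumnRenamed` on the pivoted result. A minimal sketch, assuming the `X`/`Y`/`Z` columns from the example output (the `Type_*` target names are made up for illustration):

```scala
// Hypothetical renames of the pivoted columns; withColumnRenamed
// is a no-op when the source column does not exist.
val renamed = result
  .withColumnRenamed("X", "Type_X")
  .withColumnRenamed("Y", "Type_Y")
  .withColumnRenamed("Z", "Type_Z")

renamed.show()
```

As a side note, if you already know the distinct values of Type, passing them explicitly as `pivot("Type", Seq("X", "Y", "Z"))` avoids the extra pass Spark otherwise makes to discover them, and also fixes the column order.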