Updated: 2023-01-16 10:50:58
First, extract the JSON schema from a sample value:
// Infer the schema from the first row's JSON string (assumes all rows share that structure)
val schema = schema_of_json(lit(df.select($"activegroup").as[String].first))
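If you would rather not depend on the first row, the schema can also be written out by hand. This is only a sketch: the field names are the ones selected later in this answer, while the string types and the name explicitSchema are assumptions. It can be pasted into spark-shell or a notebook.

```scala
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Assumed layout: a JSON array of objects with three string fields.
// Field names come from this answer; the types are a guess.
val explicitSchema = ArrayType(StructType(Seq(
  StructField("groupId", StringType),
  StructField("role", StringType),
  StructField("status", StringType)
)))
```

from_json also accepts a DataType, so from_json($"activegroup", explicitSchema) can be used in place of the inferred schema.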
Once you have the schema, you can convert your activegroup column, which is a String, to JSON with from_json, and then explode it.
Once the column is JSON, you can extract its values with $"columnName.field":
val dfresult = df
  .withColumn("jsonColumn", explode(from_json($"activegroup", schema)))
  .select($"id", $"name",
    $"jsonColumn.groupId" as "groupId",
    $"jsonColumn.role" as "role",
    $"jsonColumn.status" as "status")
If you want to extract every field of the JSON and the element names are fine as they are, you can use * instead:
val dfresult = df
  .withColumn("jsonColumn", explode(from_json($"activegroup", schema)))
  .select($"id", $"name", $"jsonColumn.*")
Result:
+---+----+-------+-----+------+
| id|name|groupId| role|status|
+---+----+-------+-----+------+
|  1| abc|     5d|admin|     A|
|  1| abc|     58|admin|     A|
+---+----+-------+-----+------+
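For anyone who wants to try this end to end, here is a minimal self-contained sketch. The input row is an assumption reconstructed from the result table (the original data is not shown here), and the object name ExplodeJsonColumn, the helper explodeActiveGroup, and the local master setting are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Encoders, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json, lit, schema_of_json}

object ExplodeJsonColumn {
  // Parses the JSON string in "activegroup" and returns one row per array element.
  def explodeActiveGroup(df: DataFrame): DataFrame = {
    // Infer the schema from the first row (assumes all rows share that structure).
    val sample = df.select(col("activegroup")).as[String](Encoders.STRING).first
    val schema = schema_of_json(lit(sample))
    df.withColumn("jsonColumn", explode(from_json(col("activegroup"), schema)))
      .select(col("id"), col("name"), col("jsonColumn.*"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // local mode, for experimentation only
      .appName("explode-json-column")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample input: a JSON array of group objects stored as a String.
    val df = Seq(
      (1, "abc",
        """[{"groupId":"5d","role":"admin","status":"A"},{"groupId":"58","role":"admin","status":"A"}]""")
    ).toDF("id", "name", "activegroup")

    explodeActiveGroup(df).show()
    spark.stop()
  }
}
```

With that assumed input, the output should match the result table above. Note that explode drops rows whose array is null or empty; explode_outer keeps them with null fields, if that is what you need.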