Updated: 2023-11-18 23:01:04
Your statement
mask.foreach(c => base.withColumn(c, regexp_replace(col(c), "^.*?$", "*** Masked ***" ) ) )
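does not give you a masked DataFrame. foreach returns Unit, so the DataFrame produced by each withColumn call is thrown away and base is left unchanged. Even with map in place of foreach you would get a Seq[org.apache.spark.sql.DataFrame], one per masked column, which doesn't sound too good either.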
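If you would rather keep withColumn, one common pattern is to thread the DataFrame through the list with foldLeft, so that each step builds on the result of the previous one. The following is only a minimal sketch: it assumes a local SparkSession and rebuilds the sample data from this answer, the names spark and masked are illustrative, and the explicit cast to string is an addition so that the numeric age column is handled the same way regardless of type-coercion settings.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_replace}

// Assumption: a throwaway local session, only for trying this out.
val spark = SparkSession.builder().master("local[1]").appName("mask-foldLeft").getOrCreate()
import spark.implicits._

// Sample data mirroring the table in this answer.
val base = Seq((1, "abcd", 12345, "KT10 "), (2, "qazx", 98765, "AD12d"))
  .toDF("id", "name", "age", "address")
val mask = Seq("name", "age")

// foldLeft passes the DataFrame returned by one withColumn call into the next,
// so the final result has every column listed in mask replaced.
val masked: DataFrame = mask.foldLeft(base) { (df, c) =>
  df.withColumn(c, regexp_replace(col(c).cast("string"), "^.*?$", "*** Masked ***"))
}

masked.show()
```

Each withColumn call adds a projection to the query plan, so for a long list of columns a single select or selectExpr is usually the lighter option.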
You can use selectExpr instead and generate the regexp_replace expressions for it. Starting from:
base.show
+---+----+-----+-------+
| id|name|  age|address|
+---+----+-----+-------+
|  1|abcd|12345|  KT10 |
|  2|qazx|98765|  AD12d|
+---+----+-----+-------+
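val mask = Seq("name", "age")
// Build one SQL expression per column of base: columns listed in mask
// are overwritten with the placeholder, all others pass through unchanged.
val expr = base.columns.map { c =>
  if (mask.contains(c)) s"""regexp_replace(${c}, "^.*", "** Masked **" ) as ${c}"""
  else c
}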
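This generates a regexp_replace expression for every column named in the sequence mask and leaves the other column names as they are: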
Array[String] = Array(id, regexp_replace(name, "^.*", "** Masked **" ) as name, regexp_replace(age, "^.*", "** Masked **" ) as age, address)
Now you can use selectExpr on the generated sequence:
base.selectExpr(expr: _*).show
+---+------------+------------+-------+
| id|        name|         age|address|
+---+------------+------------+-------+
|  1|** Masked **|** Masked **|  KT10 |
|  2|** Masked **|** Masked **|  AD12d|
+---+------------+------------+-------+
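The same idea can also be expressed with Column objects rather than SQL strings, which avoids quoting issues when the expressions are assembled by hand. This is a hedged variant and not part of the approach above: it assumes a local SparkSession, the names spark, cols and masked are illustrative, and it swaps regexp_replace for a constant lit, since the whole value is overwritten anyway. One difference to be aware of is that lit also masks null values, whereas regexp_replace returns null for a null input.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit}

// Assumption: a throwaway local session, only for trying this out.
val spark = SparkSession.builder().master("local[1]").appName("mask-select").getOrCreate()
import spark.implicits._

// Sample data mirroring the table in this answer.
val base = Seq((1, "abcd", 12345, "KT10 "), (2, "qazx", 98765, "AD12d"))
  .toDF("id", "name", "age", "address")
val mask = Set("name", "age")

// One Column per input column: a constant placeholder for masked columns,
// a plain reference for everything else. Column order is preserved.
val cols = base.columns.map { c =>
  if (mask.contains(c)) lit("** Masked **").as(c) else col(c)
}

val masked = base.select(cols: _*)
masked.show()
```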