更新时间:2023-11-22 23:02:40
我想发布我的最终解决方案:
I want post my final solution:
val finalDf = originalDf.withColumn("name", maxValAsMap(keys, values)).select("cookie_id", "max_column")
val maxValAsMap = udf((keys: Seq[String], values: Seq[Any]) => {
val valueMap:Map[String,Double] = (keys zip values).filter( _._2.isInstanceOf[Double] ).map{
case (x,y) => (x, y.asInstanceOf[Double])
}.toMap
if (valueMap.isEmpty) "not computed" else valueMap.maxBy(_._2)._1
})
它的工作速度非常快.