Updated: 2023-11-18 23:09:40
You can use window functions with row_number:
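import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number
// The $"..." column syntax needs spark.implicits._ (already in scope in spark-shell).

// One window per user; the ordering is added separately for each ranking.
val w = Window.partitionBy($"user_id")
// Number the rows by weight within each user, ascending and descending.
val rankAsc = row_number().over(w.orderBy($"weight")).alias("rank_asc")
val rankDesc = row_number().over(w.orderBy($"weight".desc)).alias("rank_desc")
// Keep the two lowest and the two highest weights per user.
df.select($"*", rankAsc, rankDesc).filter($"rank_asc" <= 2 || $"rank_desc" <= 2)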
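To try the snippet above outside spark-shell, here is a minimal self-contained sketch. It assumes Spark 2.0 or later (for SparkSession) and a local master. The object name TopBottomPerUser, the helper topBottomN and the sample rows are hypothetical, introduced only for illustration; the user_id and weight columns come from the snippet above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object TopBottomPerUser {
  // Hypothetical helper: keeps the n lowest and n highest weights per user_id.
  def topBottomN(df: DataFrame, n: Int): DataFrame = {
    val w = Window.partitionBy(col("user_id"))
    val rankAsc = row_number().over(w.orderBy(col("weight"))).alias("rank_asc")
    val rankDesc = row_number().over(w.orderBy(col("weight").desc)).alias("rank_desc")
    df.select(col("*"), rankAsc, rankDesc)
      .filter(col("rank_asc") <= n || col("rank_desc") <= n)
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; the app name is arbitrary.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("top-bottom-per-user")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: (user_id, weight).
    val df = Seq(
      (1, 0.1), (1, 0.2), (1, 0.3), (1, 0.4), (1, 0.5),
      (2, 1.0), (2, 2.0), (2, 3.0)
    ).toDF("user_id", "weight")

    topBottomN(df, 2).orderBy(col("user_id"), col("weight")).show()
    spark.stop()
  }
}
```

With this sample, user 1 loses only the middle row (weight 0.3), while user 2 keeps all three rows because each of them is within the first two from one end or the other. Note that row_number breaks ties in an unspecified order, so with duplicate weights the rows that survive the filter may vary between runs.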
In Spark 1.5.0 you can use rowNumber instead of row_number.