且构网

Date difference between consecutive rows - Pyspark Dataframe

Updated: 2023-01-20 09:27:33

One way is to use the lag window function, partitioned by user_id and ordered by date, like this:

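# Register the DataFrame as a temporary table so it can be queried with SQL.
# (On Spark 2.0+, df.createOrReplaceTempView("df") is the non-deprecated equivalent.)
df.registerTempTable("df")

# lag(date, 1) returns the previous row's date within each user_id partition.
# Casting a timestamp to bigint gives epoch seconds, so diff is in seconds;
# the first row of each partition has no previous row, so its diff is NULL.
sqlContext.sql("""
     SELECT *, CAST(date AS bigint) - CAST(lag(date, 1) OVER (
              PARTITION BY user_id ORDER BY date) AS bigint) AS diff
     FROM df""")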
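
To make the semantics of the query concrete, here is a minimal plain-Python sketch of what it computes, without Spark. It assumes date holds timestamps, so the difference comes out in seconds. The sample rows and the helper name date_diffs are hypothetical, made up for illustration rather than taken from the original question.

```python
from datetime import datetime, timezone
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows (user_id, date); not from the original question.
rows = [
    ("a", datetime(2023, 1, 1, 0, 0, 0, tzinfo=timezone.utc)),
    ("b", datetime(2023, 1, 1, 12, 0, 0, tzinfo=timezone.utc)),
    ("a", datetime(2023, 1, 1, 0, 0, 30, tzinfo=timezone.utc)),
    ("a", datetime(2023, 1, 2, 0, 0, 30, tzinfo=timezone.utc)),
]


def date_diffs(rows):
    """Return (user_id, date, diff_seconds), diff relative to the user's previous row."""
    result = []
    # PARTITION BY user_id ORDER BY date: sort by (user_id, date), then group by user_id.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for user_id, group in groupby(ordered, key=itemgetter(0)):
        previous = None  # lag(date, 1) is NULL for the first row of a partition
        for _, date in group:
            if previous is None:
                diff = None
            else:
                # Mirrors CAST(... AS bigint) on a timestamp: whole epoch seconds.
                diff = int(date.timestamp()) - int(previous.timestamp())
            result.append((user_id, date, diff))
            previous = date
    return result


for user_id, date, diff in date_diffs(rows):
    print(user_id, date.isoformat(), diff)
```

Each user's first row gets None, matching the NULL that lag produces when there is no preceding row in the partition. To get days instead of seconds, divide the difference by 86400.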