Updated: 2023-12-01 17:41:58
input_table.withColumn returns a new DataFrame; it does not modify input_table in place. So, to display the result, assign it to a value and call show on that:
val dfWithToEquals = input_table.withColumn("toEquals", toEquals($"start_date", $"finish_date"))
dfWithToEquals.printSchema()
dfWithToEquals.show()
Update
To resolve the Task not serializable exception: objects referenced by code that Spark ships to executors must be serializable. Here the DATE_TIME_FORMATTER reference is created outside the udf, and java.time.format.DateTimeFormatter is not serializable. Try moving its instantiation inside the function:
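import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.functions.udf

def toEquals = udf((rd1: String, rd2: String) => {
  // Build the formatter inside the function so the closure does not capture it.
  val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  val d1 = adjust(LocalDateTime.parse(rd1, formatter))
  val d2 = adjust(LocalDateTime.parse(rd2, formatter), asc = false)
  // remaining code unchanged
})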
End of update
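The difference between the two placements of the formatter can be checked without a cluster. The following is a minimal sketch, not the original poster's code: the object SerializationCheck and its methods isSerializable, capturingOutside and creatingInside are hypothetical names introduced for illustration. It uses plain Java serialization as a stand-in for the closure check Spark performs before running a task, and assumes Scala 2.12 or later, where function literals are serializable as long as everything they capture is.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper object, for illustration only.
object SerializationCheck {
  // Returns true if plain Java serialization accepts the object.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(obj) finally out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // The formatter is created outside the function body, so the
  // function captures it and must serialize it along with itself.
  def capturingOutside(): String => LocalDateTime = {
    val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
    (s: String) => LocalDateTime.parse(s, formatter)
  }

  // The formatter is created inside the function body, so the
  // function captures nothing from its surroundings.
  def creatingInside(): String => LocalDateTime =
    (s: String) => {
      val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
      LocalDateTime.parse(s, formatter)
    }

  def main(args: Array[String]): Unit = {
    println(isSerializable(capturingOutside())) // expected: false
    println(isSerializable(creatingInside()))   // expected: true
  }
}
```

Creating the formatter inside the function means it is rebuilt on every call. If that cost matters, a common alternative is to hold it in a top-level object, which is initialized separately on each JVM instead of being serialized with the closure.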