且构网


WithColumn: displaying the new dateTime column

Updated: 2023-12-01 17:41:58

input_table.withColumn returns a new DataFrame; it does not modify input_table. To display the new column, keep the returned DataFrame and show it:

val dfWithToEquals = input_table.withColumn("toEquals", toEquals($"start_date",$"finish_date"))
dfWithToEquals.printSchema()
dfWithToEquals.show()

Update

To resolve the Task not serializable exception: every object referenced by a function passed to Spark must be serializable, because Spark serializes the function in order to ship it to the executors. Here, DATE_TIME_FORMATTER is created outside the udf and is not serializable. Try moving its instantiation inside the function:

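import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.functions.udf

def toEquals = udf((rd1: String, rd2: String) => {
  // Create the formatter inside the function so the closure does not capture it
  val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  val d1 = adjust(LocalDateTime.parse(rd1, formatter))
  val d2 = adjust(LocalDateTime.parse(rd2, formatter), asc = false)
  // remaining code unchanged
})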
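
If you want to see why the formatter is the culprit, here is a minimal sketch that needs no Spark at all. It assumes only the JDK: the object name SerializableCheck and the helper canSerialize are hypothetical names made up for this illustration. The helper attempts plain Java serialization, which is the mechanism Spark uses for closures, on the pattern string and on the formatter built from it.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}
import java.time.format.DateTimeFormatter

object SerializableCheck {
  // Hypothetical helper: returns true if Java serialization accepts the object,
  // false if it throws NotSerializableException.
  def canSerialize(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val pattern = "yyyy-MM-dd HH:mm:ss"
    val formatter = DateTimeFormatter.ofPattern(pattern)

    // A String implements java.io.Serializable.
    println(s"pattern serializable:   ${canSerialize(pattern)}")   // true
    // DateTimeFormatter does not implement java.io.Serializable.
    println(s"formatter serializable: ${canSerialize(formatter)}") // false
  }
}
```

This is why building the formatter inside the udf body helps: the function then references only the pattern string literal, and each executor creates its own formatter when the function runs.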

End of update