Spark - Scala: not a member of org.apache.spark.sql.Row

Updated: 2023-11-18 14:54:04

When you convert a DataFrame to an RDD, you get an RDD[Row], so when you use map, your function receives a Row as its parameter. You must therefore use the Row methods to access its members (note that indices start at 0):

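import org.apache.spark.sql.Row

// Indices are 0-based, so 1 and 2 select the second and third columns.
df.rdd.map {
  row: Row => (row.getString(1) + "_" + row.getString(2), row)
}.take(5)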
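As a further sketch, fields can also be read by column name with getAs, or by pattern matching on the Row, which avoids counting positions. The object name RowAccessSketch, the helper keyByPattern, the local SparkSession, the sample data, and the column names col1, col2, col3 are all assumptions made for illustration; none of them come from the original question.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object RowAccessSketch {
  // Hypothetical helper: builds the key by pattern matching on a
  // three-field Row whose second and third fields are Strings.
  def keyByPattern(row: Row): String = row match {
    case Row(_, c2: String, c3: String) => c2 + "_" + c3
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session and sample data, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("row-access-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("1", "a", "x"), ("2", "b", "y")).toDF("col1", "col2", "col3")

    // Access by column name: getAs looks the field up in the Row's schema,
    // which Rows coming from a DataFrame carry.
    val byName = df.rdd.map { row =>
      (row.getAs[String]("col2") + "_" + row.getAs[String]("col3"), row)
    }.collect()

    // Access by pattern matching on the Row's fields.
    val byPattern = df.rdd.map(row => (keyByPattern(row), row)).collect()

    byName.foreach(println)
    byPattern.foreach(println)
    spark.stop()
  }
}
```

Name-based access only works on Rows that carry a schema, which is the case for Rows obtained from df.rdd; a Row built by hand with Row(...) has no field names.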

You can find more examples and the full list of methods available on Row objects in the Spark scaladoc.

I don't know why you are doing this, but for concatenating String columns of a DataFrame you may consider the following option:

import org.apache.spark.sql.functions._
val newDF = df.withColumn("concat", concat(df("col2"), lit("_"), df("col3")))
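A closely related built-in is concat_ws, which takes the separator as its first argument. One difference worth knowing: concat returns null if any input is null, while concat_ws skips null inputs. The sketch below assumes the same hypothetical columns col2 and col3; the object ConcatSketch, the helper withConcat, and the sample data are illustrative and not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}

object ConcatSketch {
  // Hypothetical helper: adds a "concat" column joining col2 and col3 with "_".
  def withConcat(df: DataFrame): DataFrame =
    df.withColumn("concat", concat_ws("_", col("col2"), col("col3")))

  def main(args: Array[String]): Unit = {
    // Assumed local session and sample data, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("concat-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("1", "a", "x"), ("2", "b", "y")).toDF("col1", "col2", "col3")
    withConcat(df).show()
    spark.stop()
  }
}
```

Staying in the DataFrame API this way keeps the work inside Spark's optimizer instead of dropping down to an RDD of Row objects.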
