且构网

How do I overwrite an entire existing column in a Spark DataFrame with a new column?

Updated: 2023-11-18 22:34:28

You can use either of the following:

d1.withColumnRenamed("colName", "newColName")
d1.withColumn("newColName", $"colName")

withColumnRenamed renames an existing column to the new name.

withColumn adds a column with the given name. If a column with that name already exists, it is replaced by the new one.

In your case the change is not applied to the original DataFrame df2. The call returns a new DataFrame with the changed column, and that result has to be assigned to a variable before you can use it further.

d3 = df2.select((df2.id2 > 0).alias("id2"))

The above should work in your case.

Hope this helps!