且构网

Sharing all things programmer development...
Apache Spark: How to insert data into a null-valued column of a DataFrame using Java

Updated: 2023-11-18 19:24:34

You need to join the two tables on the BILL_NBR column.

Assumption: there is a one-to-one relationship between the BILL_NBR and BILL_ID columns.

Assuming the DataFrames loaded from File1.csv and File2.csv are named file1DF and file2DF respectively, the following should work for you:

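// Keep only the columns needed from each file (new variable names avoid redeclaring file1DF/file2DF).
Dataset<Row> file1Cols = file1DF.select("BILL_ID", "BILL_NBR", "BILL_NBR_TYPE_CD");
Dataset<Row> file2Cols = file2DF.select("TXN_ID", "TXN_TYPE", "BILL_NBR_TYPE_CD", "BILL_NBR");
// Join on both key columns so that each File2 row picks up BILL_ID from File1.
Dataset<Row> joined = file2Cols.join(file1Cols, file2Cols.col("BILL_NBR").equalTo(file1Cols.col("BILL_NBR"))
        .and(file2Cols.col("BILL_NBR_TYPE_CD").equalTo(file1Cols.col("BILL_NBR_TYPE_CD"))));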

Note: I did not have the resources to run and test the code above. Please let me know if you hit any compile-time or runtime errors.
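
To make the one-to-one assumption concrete, here is a minimal sketch in plain Java collections (deliberately not Spark, so it runs without a cluster). It treats File1 as a lookup from BILL_NBR to BILL_ID and uses it to fill missing BILL_ID values in File2-style rows. The class name BillIdFill, the helper names, the row layouts, and the sample values are all hypothetical and chosen only for illustration. Unlike an inner join, this sketch leaves a row's BILL_ID as null when there is no match rather than dropping the row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BillIdFill {
    // Hypothetical helper: builds BILL_NBR -> BILL_ID from File1-style rows {BILL_ID, BILL_NBR}.
    // Throws if one BILL_NBR maps to two different BILL_IDs, i.e. the one-to-one assumption is violated.
    public static Map<String, String> buildLookup(List<String[]> file1Rows) {
        Map<String, String> billIdByNbr = new HashMap<>();
        for (String[] row : file1Rows) {
            String previous = billIdByNbr.put(row[1], row[0]);
            if (previous != null && !previous.equals(row[0])) {
                throw new IllegalStateException("BILL_NBR " + row[1] + " maps to more than one BILL_ID");
            }
        }
        return billIdByNbr;
    }

    // Hypothetical helper: fills a null BILL_ID in File2-style rows {TXN_ID, BILL_NBR, BILL_ID}.
    // Rows that already have a BILL_ID are copied unchanged; unmatched rows keep null.
    public static List<String[]> fillBillIds(List<String[]> file2Rows, Map<String, String> billIdByNbr) {
        List<String[]> filled = new ArrayList<>();
        for (String[] row : file2Rows) {
            String billId = row[2] != null ? row[2] : billIdByNbr.get(row[1]);
            filled.add(new String[] {row[0], row[1], billId});
        }
        return filled;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the original files.
        List<String[]> file1 = List.of(new String[] {"B1", "1001"}, new String[] {"B2", "1002"});
        List<String[]> file2 = new ArrayList<>();
        file2.add(new String[] {"T1", "1001", null});
        file2.add(new String[] {"T2", "1002", null});
        for (String[] row : fillBillIds(file2, buildLookup(file1))) {
            System.out.println(String.join(",", row)); // prints T1,1001,B1 then T2,1002,B2
        }
    }
}
```

If buildLookup throws, the one-to-one assumption does not hold for your data, and a join on BILL_NBR would then produce more than one output row per transaction.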