
saveAsTable in Spark Scala: HDP 3.x

Updated: 2023-11-18 14:27:40

See if the solution below works for you. It writes through the Hive Warehouse Connector (HWC) rather than calling saveAsTable directly:

import org.apache.spark.sql.functions.when

// Join both DataFrames on inv_num, then derive the merged salary.
val df3 = df1.join(df2, df1("inv_num") === df2("inv_num"))
  .withColumn("finalSalary",
    when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
      .when(df1("salary") > df2("salary"), df1("salary") + df2("salary"))  // e.g. 5000 + 3000 = 8000
      .otherwise(df2("salary")))  // no condition matched: take the value from df2
  .drop(df1("salary"))
  .drop(df2("salary"))
  .withColumnRenamed("finalSalary", "salary")
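
To sanity-check the salary rule before running it on a cluster, the same branching can be mirrored in plain Scala with no Spark dependency. This is only a sketch: `mergeSalary` is a hypothetical helper name, salaries are assumed to be integral, and null handling is left out.

```scala
// Plain-Scala mirror of the when/otherwise rule used in the join.
// mergeSalary is a hypothetical helper, not part of Spark or HWC.
def mergeSalary(s1: Long, s2: Long): Long =
  if (s1 < s2) s2 - s1        // first salary is lower: take the difference
  else if (s1 > s2) s1 + s2   // first salary is higher: take the sum
  else s2                     // equal: keep the value from the second DataFrame

println(mergeSalary(5000L, 3000L)) // 8000
println(mergeSalary(3000L, 5000L)) // 2000
println(mergeSalary(4000L, 4000L)) // 4000
```

If these values do not match what you expect for your data, adjust the conditions in the `when` chain accordingly.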

// Build a Hive Warehouse Connector session on top of the existing SparkSession.
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

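df3.createOrReplaceTempView("<temp-tbl-name>")
hive.setDatabase("<db-name>")
// The builder only takes effect on create(); <col-type> is a placeholder for salary's Hive type.
hive.createTable("<tbl-name>")
  .ifNotExists()
  .column("salary", "<col-type>")
  .create()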

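// HIVE_WAREHOUSE_CONNECTOR is a constant on HiveWarehouseSession; the package may differ by HWC version.
import com.hortonworks.hwc.HiveWarehouseSession._

spark.sql("SELECT salary FROM <temp-tbl-name>")
  .write
  .format(HIVE_WAREHOUSE_CONNECTOR)
  .mode("append")
  .option("table", "<tbl-name>")
  .save()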