
How to overwrite data with PySpark's JDBC without losing the schema?

Updated: 2023-01-19 14:28:02

The default behavior for mode="overwrite" is to first delete the table, then recreate it with the new data. You can instead truncate the data by including option("truncate", "true") and then push your own:

df.write.option("truncate", "true").jdbc(url=DATABASE_URL, table=DATABASE_TABLE, mode="overwrite", properties=DATABASE_PROPERTIES)

This way, the table is not recreated, so the overwrite should not modify your schema.