Updated: 2022-06-18 03:51:31
When using spark-submit, you should also supply the driver JAR to the executors.
To get started, you will need to include the JDBC driver for your particular database on the Spark classpath. For example, to connect to Postgres from the Spark shell, you would run the following command:
bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9.4.1207.jar
Note: the same options should apply to the spark-submit command.
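As a rough sketch of what that could look like with spark-submit, the snippet below reuses the same driver JAR and the same two options as the spark-shell command. The main class `com.example.MyApp` and the application JAR `my-app.jar` are hypothetical placeholders, not names from the original answer. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Same driver JAR as in the spark-shell example.
JDBC_JAR="postgresql-9.4.1207.jar"

# Hypothetical application details (assumptions); replace with your own.
MAIN_CLASS="com.example.MyApp"
APP_JAR="my-app.jar"

# --driver-class-path adds the JAR to the driver's classpath;
# --jars distributes it so that it is also available to the executors.
SUBMIT_CMD="bin/spark-submit --driver-class-path $JDBC_JAR --jars $JDBC_JAR --class $MAIN_CLASS $APP_JAR"

# Print the command for review instead of running it.
echo "$SUBMIT_CMD"
```

Once the printed command looks right, it can be run from the Spark installation directory, for example with `eval "$SUBMIT_CMD"`.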
Troubleshooting
The JDBC driver class must be visible to the primordial class loader on the client session and on all executors. This is because Java's DriverManager class performs a security check that causes it to ignore any driver not visible to the primordial class loader when a connection is opened. One convenient way to achieve this is to modify compute_classpath.sh on all worker nodes to include your driver JARs.