Spark installation - Error: Could not find or load main class org.apache.spark.launcher.Main

Updated: 2022-06-07 05:58:22

I had that error message. It may have several root causes, but this is how I investigated and solved the problem (on Linux):

  • Instead of launching spark-submit directly, try bash -x spark-submit to see which line fails.
  • Repeat that several times (since spark-submit calls nested scripts) until you find the underlying process that gets called; in my case it was something like:

/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell
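
A minimal sketch of that tracing loop, assuming the Spark 2.2.0 install under /opt/spark-2.2.0-bin-hadoop2.7 seen in the trace above (in a stock distribution, spark-shell calls spark-submit, which calls spark-class, the script that finally builds the java command):

  # Trace the top-level script; the end of the trace shows the nested script it execs.
  bash -x /opt/spark-2.2.0-bin-hadoop2.7/bin/spark-shell

  # Repeat on each script the trace reveals, ending with spark-class,
  # which assembles and execs the final java command shown above.
  bash -x /opt/spark-2.2.0-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.SparkSubmit \
    --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell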

So, spark-submit launches a java process, and it can't find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I did an ls in this jars folder and counted 4 files instead of the whole Spark distribution (~200 files). It was probably a problem during the installation process. So I reinstalled Spark, checked the jars folder, and it worked like a charm.
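
To run the same sanity check, a short sketch assuming the install path above; a healthy 2.2.0 distribution ships roughly 200 jars, and spark-launcher_*.jar is the one that provides the missing class:

  # Count the jars in the install; a handful instead of ~200 points to a broken unpack.
  ls /opt/spark-2.2.0-bin-hadoop2.7/jars | wc -l
  # The launcher jar in particular must exist for org.apache.spark.launcher.Main to load.
  ls /opt/spark-2.2.0-bin-hadoop2.7/jars/spark-launcher_*.jar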

So, you should:

  • check your java command (the -cp option); a direct probe is sketched after this list
  • check your jars folder (does it contain at least all the spark-*.jar files?)
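
For the first check, one direct probe is to load the class from the same classpath the failing java command used (a sketch assuming the -cp value above). If the jar is present, you should get an error from Spark's own launcher about missing arguments rather than the JVM's "Could not find or load main class" message:

  # Probe the classpath directly: the kind of error tells you whether the class resolved.
  java -cp '/opt/spark-2.2.0-bin-hadoop2.7/jars/*' org.apache.spark.launcher.Main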

Hope it helps.