且构网


PySpark error: Input path does not exist

Updated: 2022-05-06 03:14:34

If you are running in cluster mode, the file has to be readable from every node: either copy it to all nodes or place it on a file system shared by all of them, so that it exists at the same path everywhere. Spark can then read it. Otherwise, put the file in HDFS.

I copied the txt file into HDFS, and Spark read it from HDFS.

I copied the txt file onto the file system shared by all nodes, and Spark then read the file.

Both approaches worked for me.
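
The two options above can be sketched roughly as follows. This is only an illustration, not code from the original answer: the helper names `spark_input_uri` and `count_lines` and the example path are made up. It assumes that spelling out the URI scheme (`hdfs://` or `file://`) is enough to stop Spark from resolving the path against the wrong file system, and that the cluster configuration defines a default NameNode so that `hdfs:///...` resolves. If it does not, write the host and port explicitly.

```python
def spark_input_uri(path, storage):
    """Build an explicit URI so Spark does not have to guess the file system.

    storage: "hdfs" for a file uploaded to HDFS,
             "local" for a file present at the same absolute path on every node.
    """
    if not path.startswith("/"):
        raise ValueError("use an absolute path: " + path)
    if storage == "hdfs":
        # hdfs:///... relies on the default NameNode from the cluster config (assumption)
        return "hdfs://" + path
    if storage == "local":
        # file:///... is resolved on each node's own (or shared) file system
        return "file://" + path
    raise ValueError("unknown storage: " + storage)


def count_lines(uri):
    """Read a text file through Spark and return its line count."""
    # Imported lazily so the URI helper can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("input-path-check").getOrCreate()
    try:
        return spark.sparkContext.textFile(uri).count()
    finally:
        spark.stop()
```

For example, `count_lines(spark_input_uri("/user/me/data.txt", "hdfs"))` would read a file from HDFS, while passing `"local"` would read a file from the shared or replicated local path. The path `/user/me/data.txt` is a placeholder.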