且构网


mrjob in Hadoop mode: error launching job, bad input path: File does not exist

Updated: 2022-01-11 02:49:43

The current hdfs-site.xml file is:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/edureka/hadoop-2.7.3/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/edureka/hadoop-2.7.3/datanode</value>
    </property>
</configuration>

You need to edit hdfs-site.xml to the following, which drops the dfs.namenode.name.dir property:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/edureka/hadoop-2.7.3/datanode</value>
    </property>
</configuration>
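
If you would rather script that edit than do it by hand, here is a minimal sketch using Python's standard xml.etree.ElementTree. The helper names drop_property and property_names and the shortened SAMPLE text are my own for illustration, not part of Hadoop or mrjob. Note that ElementTree does not keep the XML declaration or the xml-stylesheet line, so you would have to add those back to the output yourself.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for hdfs-site.xml (illustration only).
SAMPLE = """<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/edureka/hadoop-2.7.3/namenode</value>
    </property>
</configuration>"""


def drop_property(xml_text, name):
    """Return the configuration XML with every <property> called `name` removed."""
    root = ET.fromstring(xml_text)
    # Copy the list first so removing elements does not disturb the iteration.
    for prop in list(root.findall("property")):
        if prop.findtext("name") == name:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


def property_names(xml_text):
    """List the <name> of each <property>, in document order."""
    return [p.findtext("name") for p in ET.fromstring(xml_text).findall("property")]


if __name__ == "__main__":
    edited = drop_property(SAMPLE, "dfs.namenode.name.dir")
    print(property_names(edited))
```

Editing the file by hand is just as good for a one-off change; the script only helps if you have to repeat this on several nodes.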

You also need to create a mapred-site.xml file with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

You also need to edit yarn-site.xml so that it contains:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
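
Before starting the daemons, it can help to sanity-check the two files. The sketch below only looks at the two settings that matter here: mapreduce.framework.name set to yarn, and mapreduce_shuffle listed in yarn.nodemanager.aux-services. The function names and the inline sample strings are hypothetical; in practice you would read the real files from your Hadoop configuration directory.

```python
import xml.etree.ElementTree as ET

# Inline stand-ins for mapred-site.xml and yarn-site.xml (illustration only).
MAPRED_SITE = """<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>"""

YARN_SITE = """<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>"""


def read_properties(xml_text):
    """Map each property <name> to its <value> for a Hadoop-style config file."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def check_yarn_setup(mapred_xml, yarn_xml):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    mapred = read_properties(mapred_xml)
    yarn = read_properties(yarn_xml)
    if mapred.get("mapreduce.framework.name") != "yarn":
        problems.append("mapreduce.framework.name is not set to yarn")
    # aux-services is a comma-separated list, so split it before checking.
    services = [s.strip() for s in (yarn.get("yarn.nodemanager.aux-services") or "").split(",")]
    if "mapreduce_shuffle" not in services:
        problems.append("yarn.nodemanager.aux-services does not list mapreduce_shuffle")
    return problems


if __name__ == "__main__":
    print(check_yarn_setup(MAPRED_SITE, YARN_SITE) or "configuration looks consistent")
```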

Then start HDFS and YARN:

start-dfs.sh
start-yarn.sh

Then create the input directory in HDFS and upload the file:

hdfs dfs -mkdir /user/
hdfs dfs -mkdir /user/me/
hdfs dfs -mkdir /user/me/input/
hdfs dfs -put /home/me/Desktop/work/cv/hadoop/salaries.csv /user/me/input/
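
The same upload can be driven from Python if you prefer. This is only a sketch: upload_commands and run_all are names I made up, and it assumes the hdfs command is on your PATH. It uses hdfs dfs -mkdir -p to create the whole directory chain in one call and hdfs dfs -test -e to confirm that the file really exists in HDFS afterwards, which is exactly what the "File does not exist" error is complaining about.

```python
import posixpath
import subprocess


def upload_commands(local_path, hdfs_dir):
    """Build the hdfs CLI calls that create hdfs_dir, upload the file, and verify it."""
    target = posixpath.join(hdfs_dir, posixpath.basename(local_path))
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],      # create parent directories as needed
        ["hdfs", "dfs", "-put", local_path, hdfs_dir],  # copy the local file into HDFS
        ["hdfs", "dfs", "-test", "-e", target],         # non-zero exit status if the file is missing
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure (requires the hdfs CLI)."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_all(...) yourself on a machine with Hadoop.
    for cmd in upload_commands("/home/me/Desktop/work/cv/hadoop/salaries.csv", "/user/me/input/"):
        print(" ".join(cmd))
```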

Now running:

sudo chmod a+x /home/me/Desktop/work/cv/hadoop/top_salaries.py
python2 top_salaries.py -r hadoop hdfs:///user/me/input/salaries.csv > answer.csv

works.
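
If you launch the job from another script, something along these lines might do. It is a sketch under my own assumptions: to_hdfs_uri, job_command and run_job are hypothetical helpers, and the only mrjob feature relied on is the -r hadoop switch from the command above. The input is always passed as an hdfs:/// URI, and standard output is written to a file in the same way as the > answer.csv redirect.

```python
import subprocess
from urllib.parse import urlparse


def to_hdfs_uri(path):
    """Return `path` as an hdfs:/// URI; leave an existing hdfs URI unchanged."""
    if urlparse(path).scheme == "hdfs":
        return path
    if not path.startswith("/"):
        raise ValueError("expected an absolute HDFS path, got %r" % path)
    return "hdfs://" + path


def job_command(script, hdfs_path, python="python2"):
    """Build the command line for running an mrjob script with the hadoop runner."""
    return [python, script, "-r", "hadoop", to_hdfs_uri(hdfs_path)]


def run_job(command, output_path):
    """Run the job and write its standard output to output_path (needs Hadoop and mrjob)."""
    with open(output_path, "wb") as out:
        subprocess.run(command, stdout=out, check=True)


if __name__ == "__main__":
    # Print only; call run_job(cmd, "answer.csv") yourself on the cluster machine.
    cmd = job_command("top_salaries.py", "/user/me/input/salaries.csv")
    print(" ".join(cmd))
```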
