
Hadoop: Testing the First MapReduce Program

Updated: 2022-10-03 16:33:13

Note: we test the wordcount example program bundled with Hadoop, which counts how many times each word appears in the input files.
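The wordcount logic itself is simple. A minimal Python sketch of the map/reduce idea (an illustration only, not Hadoop's actual Java implementation) applied to the two input lines used later in this tutorial:

```python
from collections import Counter

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce: sum the counts for each distinct word
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# The contents of the two input files created below
lines = ["Hello World", "Hello Hadoop"]
result = reduce_phase(map_phase(lines))
print(result)  # {'Hello': 2, 'World': 1, 'Hadoop': 1}
```

Hadoop distributes exactly this pattern: map tasks run in parallel over input splits, and reduce tasks aggregate the shuffled (word, count) pairs.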

In version 2.6.0, the examples jar is located at:

/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar



1. Create a local directory and files

Create the directory:

mkdir /home/hadoop/input

cd /home/hadoop/input

Create the files:

touch wordcount1.txt

touch wordcount2.txt

2. Add content to the files

echo "Hello World" > wordcount1.txt

echo "Hello Hadoop" > wordcount2.txt


3. Create the input directory on HDFS

hadoop fs -mkdir /input


4. Copy the files to the /input directory

hadoop fs -put /home/hadoop/input/* /input


5. Run the program

hadoop jar /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output


Note: wordcount is the name of the example program (its main class), /input is the input directory, and /output is the output directory. The output directory must not already exist, or the job will fail; remove a leftover one first with hadoop fs -rm -r /output if necessary.


6. Job execution log

15/04/14 15:55:03 INFO client.RMProxy: Connecting to ResourceManager at hdnn140/192.168.152.140:8032

15/04/14 15:55:04 INFO input.FileInputFormat: Total input paths to process : 2

15/04/14 15:55:04 INFO mapreduce.JobSubmitter: number of splits:2

15/04/14 15:55:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428996061278_0002

15/04/14 15:55:05 INFO impl.YarnClientImpl: Submitted application application_1428996061278_0002

15/04/14 15:55:05 INFO mapreduce.Job: The url to track the job: http://hdnn140:8088/proxy/application_1428996061278_0002/

15/04/14 15:55:05 INFO mapreduce.Job: Running job: job_1428996061278_0002

15/04/14 15:55:17 INFO mapreduce.Job: Job job_1428996061278_0002 running in uber mode : false

15/04/14 15:55:17 INFO mapreduce.Job:  map 0% reduce 0%

15/04/14 15:56:00 INFO mapreduce.Job:  map 100% reduce 0%

15/04/14 15:56:10 INFO mapreduce.Job:  map 100% reduce 100%

15/04/14 15:56:11 INFO mapreduce.Job: Job job_1428996061278_0002 completed successfully

15/04/14 15:56:11 INFO mapreduce.Job: Counters: 49

        File System Counters

                FILE: Number of bytes read=55

                FILE: Number of bytes written=316738

                FILE: Number of read operations=0

                FILE: Number of large read operations=0

                FILE: Number of write operations=0

                HDFS: Number of bytes read=235

                HDFS: Number of bytes written=25

                HDFS: Number of read operations=9

                HDFS: Number of large read operations=0

                HDFS: Number of write operations=2

        Job Counters 

                Launched map tasks=2

                Launched reduce tasks=1

                Data-local map tasks=2

                Total time spent by all maps in occupied slots (ms)=83088

                Total time spent by all reduces in occupied slots (ms)=7098

                Total time spent by all map tasks (ms)=83088

                Total time spent by all reduce tasks (ms)=7098

                Total vcore-seconds taken by all map tasks=83088

                Total vcore-seconds taken by all reduce tasks=7098

                Total megabyte-seconds taken by all map tasks=85082112

                Total megabyte-seconds taken by all reduce tasks=7268352

        Map-Reduce Framework

                Map input records=2

                Map output records=4

                Map output bytes=41

                Map output materialized bytes=61

                Input split bytes=210

                Combine input records=4

                Combine output records=4

                Reduce input groups=3

                Reduce shuffle bytes=61

                Reduce input records=4

                Reduce output records=3

                Spilled Records=8

                Shuffled Maps =2

                Failed Shuffles=0

                Merged Map outputs=2

                GC time elapsed (ms)=1649

                CPU time spent (ms)=4260

                Physical memory (bytes) snapshot=280866816

                Virtual memory (bytes) snapshot=2578739200

                Total committed heap usage (bytes)=244625408

        Shuffle Errors

                BAD_ID=0

                CONNECTION=0

                IO_ERROR=0

                WRONG_LENGTH=0

                WRONG_MAP=0

                WRONG_REDUCE=0

        File Input Format Counters 

                Bytes Read=25

        File Output Format Counters 

                Bytes Written=25
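Several of these counters can be cross-checked against the input data. The combiner runs separately inside each map task, so the "Hello" pairs produced by the two different files cannot be merged before the reduce phase; that is why "Combine input records=4" equals "Combine output records=4" here. A rough Python check of this accounting (an illustration under the assumption of one map task per file, not Hadoop internals):

```python
from collections import Counter

# One input split (file) per map task, matching the setup above
splits = [["Hello World"], ["Hello Hadoop"]]

map_output_records = 0
combine_output = []
for split in splits:
    # The combiner only merges duplicates within a single map task
    local = Counter(w for line in split for w in line.split())
    map_output_records += sum(local.values())
    combine_output.extend(local.items())

# The reducer merges across all map outputs
reduce_output = Counter()
for word, n in combine_output:
    reduce_output[word] += n

print(map_output_records)   # 4 -> "Map output records=4"
print(len(combine_output))  # 4 -> "Combine output records=4"
print(len(reduce_output))   # 3 -> "Reduce output records=3"
```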


7. After the job finishes, list the output directory

hadoop fs -ls /output


8. View the results

hadoop fs -cat /output/part-r-00000
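Given the two input lines, the reducer emits one tab-separated "word count" line per distinct word, sorted by key. The expected contents of part-r-00000 can be reconstructed, and their byte length matches "Bytes Written=25" in the counters above:

```python
# Word counts derived from "Hello World" and "Hello Hadoop"
counts = {"Hello": 2, "World": 1, "Hadoop": 1}

# wordcount's output format: "word<TAB>count\n", one line per key, sorted
output = "".join(f"{w}\t{n}\n" for w, n in sorted(counts.items()))
print(output, end="")
# Hadoop	1
# Hello	2
# World	1
print(len(output.encode()))  # 25 -> matches "Bytes Written=25"
```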


9. Done

This article was reposted from yntmdr's 51CTO blog; original link: http://blog.51cto.com/yntmdr/1632323. Please contact the original author for reprint permission.