Updated: 2023-01-11 16:55:11
My suggestion:
Create a folder in HDFS: hadoop fs -mkdir /pigdata
Load the file to the created hdfs folder: hadoop fs -put /opt/pig/tutorial/data/excite-small.log /pigdata
(Or you can do it from the grunt shell: grunt> copyFromLocal /opt/pig/tutorial/data/excite-small.log /pigdata)
Execute the Pig Latin script:
grunt> set debug on
grunt> set job.name 'first-p2-job'
grunt> log = LOAD 'hdfs://hostname:54310/pigdata/excite-small.log' AS
(user:chararray, time:long, query:chararray);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'output';
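The output files will be stored at hdfs://hostname:54310/pigdata/output, provided /pigdata is your current working directory in grunt (for example after grunt> cd /pigdata). Otherwise the relative path 'output' is resolved against your HDFS home directory, so pass the full path to STORE if you want the result under /pigdata.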
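If it helps to see what the GROUP ... BY user and COUNT(log) steps compute, here is a rough Python sketch of the same aggregation. It is only an illustration, not how Pig runs the job: the function name count_queries_per_user and the sample rows are made up, and it assumes the log is tab-separated in the user, time, query layout used in the LOAD statement.

```python
from collections import Counter


def count_queries_per_user(lines):
    """Count records per user, roughly like GROUP log BY user + COUNT(log)."""
    counts = Counter()
    for line in lines:
        # Assumption: fields are tab-separated, matching Pig's default loader.
        user = line.rstrip("\n").split("\t")[0]
        if not user:
            # Assumption: an empty user field is treated like a null,
            # which COUNT would not count.
            continue
        counts[user] += 1
    return dict(counts)


# Made-up sample rows in the user<TAB>time<TAB>query layout.
sample = [
    "A1\t970916105432\tpig latin\n",
    "B2\t970916105500\thadoop\n",
    "A1\t970916105601\thdfs put\n",
]

if __name__ == "__main__":
    for user, n in sorted(count_queries_per_user(sample).items()):
        # Same shape as a line of the stored output: user, then count.
        print(f"{user}\t{n}")
```

Each line printed has the same shape as a row of cntd: the user id followed by how many log records that user has.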