Updated: 2023-11-17 11:34:58
You can produce a globally sorted file (which is essentially what you want) using either of these methods:
Write a custom partitioner. The Partitioner is the class that divides the key space in MapReduce. The default partitioner (HashPartitioner) assigns each key to a reducer by its hash code, which spreads keys roughly evenly across the reducers but ignores key order, so concatenating the reducer outputs does not give a sorted file. A custom partitioner can instead send contiguous key ranges to successive reducers. Check out this example for writing a custom partitioner.
Use Pig or Hive on Hadoop to do the sort (both provide an ORDER BY operation).
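
To make the first method concrete, here is a minimal stand-alone sketch of range partitioning in plain Java. It is not taken from the example referenced above, and it deliberately avoids the Hadoop dependency so it can be run on its own. The class name RangePartitionSketch and the split points are hypothetical; in a real job the same logic would sit inside a subclass of org.apache.hadoop.mapreduce.Partitioner, whose getPartition method also receives the value.

```java
import java.util.Arrays;

// Hypothetical illustration class: maps keys to partitions by key range,
// so that partition 0 holds the smallest keys, partition 1 the next range, etc.
final class RangePartitionSketch {
    // Assumed split points, sorted ascending. k split points define k + 1 ranges.
    private final String[] splitPoints;

    RangePartitionSketch(String[] sortedSplitPoints) {
        this.splitPoints = sortedSplitPoints.clone();
    }

    // Same shape as Hadoop's getPartition(key, value, numPartitions), minus the value.
    int getPartition(String key, int numPartitions) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // Found at index i: the key starts range i + 1.
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // insertion point is the index of the range the key falls into.
        int partition = pos >= 0 ? pos + 1 : -(pos + 1);
        // Clamp in case fewer reducers are configured than ranges.
        return Math.min(partition, numPartitions - 1);
    }

    public static void main(String[] args) {
        RangePartitionSketch p = new RangePartitionSketch(new String[] {"h", "p"});
        for (String key : new String[] {"apple", "hadoop", "pig", "zebra"}) {
            System.out.println(key + " -> partition " + p.getPartition(key, 3));
        }
    }
}
```

Because a larger key never maps to a smaller partition number, and each reducer already receives its keys in sorted order, reading the reducer output files in partition order gives a globally sorted result. The split points should be chosen so that each range holds a similar amount of data; otherwise one reducer ends up doing most of the work.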