且构网

How do I select rows in their original order in Hive?

Updated: 2022-12-10 09:15:15

I want to select a fixed number of rows from mytable in their original order. As far as I know, the keyword 'limit' picks rows arbitrarily. The rows in mytable are in order, and I just want to select them in that order: for example, the first 10000 rows, meaning row 1 through row 10000. How can I do this? Thanks.

Rows in your table may be in order, but tables are read in parallel, and the results returned from different mappers or reducers do not arrive in the original order. That is why you need to know the rule that defines the "original order". If you know it, you can use row_number() or order by. For example:

select * from mytable order by ... limit 10000;
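
The row_number() variant mentioned above could look roughly like the sketch below. It assumes a hypothetical column event_ts that records the original order; mytable has no such column in the question, so substitute whatever column or expression actually defines the order in your data. If that column contains ties, the relative order of the tied rows is still not guaranteed.

```sql
-- Sketch only: event_ts is a hypothetical column standing in for
-- whatever defines the "original order" of mytable.
select *
from (
  select t.*,
         -- number every row according to the ordering rule
         row_number() over (order by event_ts) as rn
  from mytable t
) numbered
where rn between 1 and 10000   -- rows 1 through 10000
order by rn;
```

Compared with order by ... limit, numbering the rows first also makes it possible to pick a later range, for example rn between 10001 and 20000.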