更新时间:2023-01-28 19:48:43
您可以使用此处描述的 rank() UDF 来实现:http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
You can do it with a rank() UDF described here: http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
SELECT page-id, user-id, clicks
FROM (
SELECT page-id, user-id, rank(user-id) as rank, clicks
FROM mytable
DISTRIBUTE BY page-id, user-id
SORT BY page-id, user-id, clicks desc
) a
WHERE rank < 5
ORDER BY page-id, rank