Updated: 2023-11-18 20:55:04
It sounds like you want to collect the structs into an array. Hive comes with two functions for collecting values into arrays: collect_set and collect_list. However, those functions can only build arrays of basic (primitive) types.
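For comparison, here is a minimal sketch of the built-in functions applied to primitive columns. It assumes the new_answers table and the fieldtype and fieldlabel columns used in the query further down in this answer, and that both columns hold strings:

```sql
-- collect_set drops duplicate values; collect_list keeps every value.
-- Both are built into Hive and need no extra jar.
select id
     , collect_set(fieldtype)   as field_types
     , collect_list(fieldlabel) as field_labels
from new_answers
group by id
;
```

Each output column here is an array of strings, not an array of structs.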
The jar for the brickhouse project (https://github.com/klout/brickhouse/wiki/Downloads) provides a number of features, including the ability to collect complex types.
add jar hdfs://path/to/your/jars/brickhouse-0.6.0.jar;
Then you can add the collect function using whatever name you like:
create temporary function collect_struct as 'brickhouse.udf.collect.CollectUDAF';
The following query:
select id
     , collect_struct(
         named_struct(
           "field_id", fieldid,
           "field_label", fieldlabel,
           "field_type", fieldtype,
           "answer_id", answer_id)) as answers
     , unitname
from new_answers
group by id, unitname
;
produces the following result:
id answers unitname
1 [{"field_id":175877,"field_label":"Comment","field_type":"COMMENT","answer_id":8990947803}] Location1
2 [{"field_id":47824,"field_label":"Language","field_type":"MULTIPLE_CHOICE","answer_id":8990950069},{"field_id":48187,"field_label":"Language Type","field_type":"MULTIPLE_CHOICE","answer_id":8990950070},{"field_id":47829,"field_label":"Trans #","field_type":"TEXT","answer_id":8990950071}] Location2
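If you later need the individual structs back as rows, one option is Hive's lateral view with explode. The sketch below wraps the query above in a subquery; the aliases t, exploded, and a are made up for illustration, and it assumes collect_struct has been registered as shown earlier:

```sql
-- Unnest the collected array: one output row per struct in answers.
select t.id
     , t.unitname
     , a.field_id
     , a.field_label
from (
  select id
       , collect_struct(
           named_struct(
             "field_id", fieldid,
             "field_label", fieldlabel,
             "field_type", fieldtype,
             "answer_id", answer_id)) as answers
       , unitname
  from new_answers
  group by id, unitname
) t
lateral view explode(t.answers) exploded as a
;
```

Struct fields are read with dot notation (a.field_id), and a single element can also be addressed by index, for example answers[0].field_label.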