且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

使用JSON数据类型的记录嵌套数组查询组合

更新时间:2022-01-27 22:45:50

给出此表(您应该以这种形式提供该表):

Given this table (which you should have provided in a form like this):

CREATE TABLE reports (rep_id int primary key, data json);
INSERT INTO reports (rep_id, data)
VALUES 
  (1, '{"objects":[{"album": 1, "src":"fooA.png", "pos": "top"},   {"album": 2, "src":"barB.png", "pos": "top"}],   "background":"background.png"}')
, (2, '{"objects":[{"album": 1, "src":"fooA.png", "pos": "top"},   {"album": 2, "src":"barC.png", "pos": "top"}],   "background":"bacakground.png"}')
, (3, '{"objects":[{"album": 1, "src":"fooA.png", "pos": "middle"},{"album": 2, "src":"barB.png", "pos": "middle"}],"background":"background.png"}')
, (4, '{"objects":[{"album": 1, "src":"fooA.png", "pos": "top"},   {"album": 3, "src":"barB.png", "pos": "top"}],   "background":"backgroundA.png"}')
;

众所周知的可翻译类型的JSON记录

使用 json_populate_recordset() 用于取消嵌套记录集"objects". 该函数需要注册的行类型来定义结果列的名称和数据类型.出于本演示目的或通常用于即席查询的目的,在"objects"之后建模的临时表提供了相同的条件:

JSON records of well known, translatable type

Use json_populate_recordset() for unnesting the recordset "objects". The function requires a registered row type to define the names and data types of resulting columns. For the purpose of this demo or generally for ad-hoc queries, a temp table modeled after "objects" provides the same:

CREATE TEMP TABLE obj(album int, src text, pos text);

要找到the top 3 most common combinations ... of entries that have the same album, src, and background:

SELECT array_agg(r.rep_id) AS ids, count(*) AS ct
FROM   reports r
     , json_populate_recordset(null::obj, r.data->'objects') o
GROUP  BY r.data->>'background'
        , o.album
        , o.scr
ORDER  BY count(*) DESC
LIMIT  3;

每个对象都计数,无论是否来自同一行.您没有定义如何精确地处理它.因此,rep_id可以在数组ids中多次弹出.将DISTINCT添加到array_agg()以折叠可能的重复项.在这种情况下,计数ct可以大于数组ids的长度.

Each object counts, no matter whether from the same row or not. You did not define how to handle that exactly. Consequently, rep_id can pop up multiple times in the array ids. Add DISTINCT to array_agg() to fold possible duplicates. The count ct can be greater then the length of the array ids in this case.

需要Postgres 9.3的JSON函数和运算符以及隐式JOIN LATERAL .

Requires Postgres 9.3 for the JSON functions and operators and the implicit JOIN LATERAL.

json_array_elements() 只是取消嵌套json数组,而无需将结果转换为SQL行.相应地使用JSON运算符访问各个字段.

json_array_elements() just unnests the json array without transforming the result into a SQL row. Access individual fields with JSON operators accordingly.

SELECT array_agg(r.rep_id) AS ids, count(*) AS ct
FROM   reports r
     , json_array_elements(r.data->'objects') o
GROUP  BY r.data->>'background'
        , o->>'album'
        , o->>'scr'
ORDER  BY count(*) DESC
LIMIT  3;