且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

基于GROUP BY结果的多个自连接

更新时间:2023-10-02 11:56:04

You apparently have two intermediate inner join output tables and you want to get columns from each about some things identified by a common key. So inner join them on the key.

select
  g.client_name,
  g.dataset,
  g.plugin_name,
  LastGood,
  g.status_code,
  LastGood_bytes
  LastAttempt,
  l.status_code,
  LastAttempt_bytes
from
( -- cut & pasted Last Good http://sqlfiddle.com/#!15/f15556/16
    select
      a2.client_name,
      a2.dataset,
      a2.plugin_name,
      a2.LastGood,
      a3.status_code,
      a3.bytes_modified as LastGood_bytes
    from v_activities_2 a3
    join (
      select
        client_name,
        dataset,
        plugin_name,
        max(completed_ts) as LastGood
      from v_activities_2 a2
      where
        type like '%Backup%'
        and status_code in (30000,30005)   -- Successful (Good) Status codes
      group by
        client_name, dataset, plugin_name
    ) as a2
    on a3.client_name  = a2.client_name and
       a3.dataset      = a2.dataset and
       a3.plugin_name  = a2.plugin_name and
       a3.completed_ts = a2.LastGood
) as g
join 
( -- cut & pasted Last Attempt http://sqlfiddle.com/#!15/f15556/3
    select
      a1.client_name,
      a1.dataset,
      a1.plugin_name,
      a1.LastAttempt,
      a3.status_code,
      a3.bytes_modified as LastAttempt_bytes
    from v_activities_2 a3
    join (
      select
        client_name,
        dataset,
        plugin_name,
        max(completed_ts) as LastAttempt
      from v_activities_2 a2
      where
        type like '%Backup%'
      group by
        client_name, dataset, plugin_name
    ) as a1
    on a3.client_name  = a1.client_name and
       a3.dataset      = a1.dataset and
       a3.plugin_name  = a1.plugin_name and
       a3.completed_ts = a1.LastAttempt
) as l
on l.client_name  = g.client_name and
   l.dataset      = g.dataset and
   l.plugin_name  = g.plugin_name
order by client_name, dataset, plugin_name

This uses one of the applicable approaches in Strange duplicate behavior from GROUP_CONCAT of two LEFT JOINs of GROUP_BYs. However the correspondence of chunks of code might not be so clear. Its intermediate are left vs your inner & group_concat is your max. (But it has more approaches because of particulars of group_concat & its query.)

A correct symmetrical INNER JOIN approach: LEFT JOIN q1 & q2--1:many--then GROUP BY & GROUP_CONCAT (which is what your first query did); then separately similarly LEFT JOIN q1 & q3--1:many--then GROUP BY & GROUP_CONCAT; then INNER JOIN the two results ON user_id--1:1.

A correct cumulative LEFT JOIN approach: JOIN q1 & q2--1:many--then GROUP BY & GROUP_CONCAT; then left join that & q3--1:many--then GROUP BY & GROUP_CONCAT.

Whether this actually serves your purpose in general depends on your actual specification and constraints. Even if the two joins you link are what you want you need to explain exactly what you mean by "merge". You don't say what you want if the joins have different sets of values for the grouped columns. Force yourself to use the English language to say what rows go in the result based on what rows are in the input.

PS 1 You have undocumented/undeclared/unenforced constraints. Please declare when possible. Otherwise enforce by triggers. Document in question text if not in code. Constraints are fundamental to multiple subrow value instances in join & to group by.

PS 2 Learn the syntax/semantics for select. Learn what left/right outer join ons return--whatinner join on does plus unmatched left/right table rows extended by nulls.

PS 3 Is there any rule of thumb to construct SQL query from a human-readable description?