将同一表的多个列上的条件计数相加

更新时间：2022-12-12 14:46:31

查询

查询并不像初看起来那样简单.最短的查询字符串不一定会产生***性能.这应该尽快完成，为此应尽可能短:

SELECT p.username, COALESCE(w.ct, 0) AS won, COALESCE(l.ct, 0) AS lost
FROM  (
   SELECT loser_id AS player_id, count(*) AS ct
   FROM   match
   WHERE  winner_id = 1  -- your player_id here
   GROUP  BY 1           -- positional reference (not your player_id)
   ) w
FULL JOIN (
   SELECT winner_id AS player_id, count(*) AS ct
   FROM   match
   WHERE  loser_id = 1   -- your player_id here
   GROUP  BY 1
   ) l USING (player_id)
JOIN   player p USING (player_id)
ORDER  BY 1;

结果完全符合要求:

username | won | lost
---------+-----+-----
alice    | 3   | 2
bob      | 1   | 0
mary     | 2   | 1

SQL提琴 -具有更多可揭示的测试数据！

关键功能是 FULL [OUTER] JOIN 两个子查询之间的"strong> "(输赢).这将产生一个表格，其中列出了我们的候选人所针对的所有玩家.连接条件中的USING子句可以方便地将两个player_id列合并为一个.

此后，用一个JOIN到player来获取名称，然后仅索引扫描)出于这一点.这样，Postgres甚至根本不会访问match表，您就会获得超快的结果.

在两列integer列中，您碰巧遇到了局部最优值:这些索引的大小与简单索引的大小相同.详细信息:

复合索引对第一字段的查询也有用吗?

更恐怖，但是很慢

您可以运行相关的子查询，例如 @Giorgi建议，只需正确地运行 :>

SELECT *
FROM  (
   SELECT username
       , (SELECT count(*) FROM match
          WHERE  loser_id  = p.player_id
          AND    winner_id = 1) AS won
       , (SELECT count(*) FROM match
          WHERE  winner_id = p.player_id
          AND    loser_id  = 1) AS lost
   FROM   player p
   WHERE  player_id <> 1
   ) sub
WHERE (won > 0 OR lost > 0)
ORDER  BY username;

适用于小型表，但不能扩展.这需要每个现有播放器在player上进行顺序扫描，并在match上进行两次索引扫描.用EXPLAIN ANALYZE比较性能.

I am looking for a "better" way to perform a query in which I want to show a single player who he has played previously and the associated win-loss record for each such opponent.

Here are the tables involved stripped down to essentials:

create table player (player_id int, username text);
create table match (winner_id int, loser_id int);

insert into player values (1, 'john'), (2, 'mary'), (3, 'bob'), (4, 'alice');
insert into match values (1, 2), (1, 2), (1, 3), (1, 4), (1, 4), (1, 4)
                       , (2, 1), (4, 1), (4, 1);

Thus, john has a record of 2 wins and 1 loss vs mary; 1 win and 0 losses vs bob; and 3 wins and 2 losses vs alice.

create index idx_winners on match(winner_id);
create index idx_winners on match(loser_id);

I am using Postgres 9.4. Something in the back of my head tells me to consider LATERAL somehow but I'm having a hard time understanding the "shape" of such.

The following is the query I am using currently but something "feels off". Please help me learn and improve this.

select p.username as opponent, 
       coalesce(r.won, 0) as won, 
       coalesce(r.lost, 0) as lost
from (
    select m.winner_id, m.loser_id, count(m.*) as won, (
        select t.lost
        from (
            select winner_id, loser_id, count(*) as lost
            from match
            where loser_id = m.winner_id
            and winner_id = m.loser_id
            group by winner_id, loser_id
        ) t 
    )   
    from match m
    where m.winner_id = 1   -- this would be a parameter
    group by m.winner_id, m.loser_id
) r 
join player p on p.player_id = r.loser_id;

This works as expected. Just looking to learn some tricks or better yet proper techniques to do the same.

opponent  won  lost
--------  ---  ----
alice     3    2
bob       1    0
mary      2    1

Query

The query is not as simple as it looks at first. The shortest query string does not necessarily yield best performance. This should be as fast as it gets, being as short as possible for that:

SELECT p.username, COALESCE(w.ct, 0) AS won, COALESCE(l.ct, 0) AS lost
FROM  (
   SELECT loser_id AS player_id, count(*) AS ct
   FROM   match
   WHERE  winner_id = 1  -- your player_id here
   GROUP  BY 1           -- positional reference (not your player_id)
   ) w
FULL JOIN (
   SELECT winner_id AS player_id, count(*) AS ct
   FROM   match
   WHERE  loser_id = 1   -- your player_id here
   GROUP  BY 1
   ) l USING (player_id)
JOIN   player p USING (player_id)
ORDER  BY 1;

Result exactly as requested:

username | won | lost
---------+-----+-----
alice    | 3   | 2
bob      | 1   | 0
mary     | 2   | 1

SQL Fiddle - with more revealing test data!

The key feature is the FULL [OUTER] JOIN between the two subqueries for losses and wins. This produces a table of all players our candidate has played against. The USING clause in the join condition conveniently merges the two player_id columns into one.

After that, a single JOIN to player to get the name, and COALESCE to replace NULL with 0. Voilá.

Index

Would be even faster with two multicolumn indexes:

CREATE INDEX idx_winner on match (winner_id, loser_id);
CREATE INDEX idx_loser  on match (loser_id, winner_id);

Only if you get index-only scans out of this. Then Postgres does not even visit the match table at all and you get super-fast results.

With two integer columns you happen to hit a local optimum: theses indexes have just the same size as the simple ones you had. Details:

Is a composite index also good for queries on the first field?

Shorter, but slow

You could run correlated subqueries like @Giorgi suggested, just working correctly:

SELECT *
FROM  (
   SELECT username
       , (SELECT count(*) FROM match
          WHERE  loser_id  = p.player_id
          AND    winner_id = 1) AS won
       , (SELECT count(*) FROM match
          WHERE  winner_id = p.player_id
          AND    loser_id  = 1) AS lost
   FROM   player p
   WHERE  player_id <> 1
   ) sub
WHERE (won > 0 OR lost > 0)
ORDER  BY username;

Works fine for small tables, but doesn't scale. This needs a sequential scan on player and two index scans on match per existing player. Compare performance with EXPLAIN ANALYZE.

上一篇 : ：MySQL AUTO_INCREMENT不会回滚下一篇 : 将同一数据表中的几组列相乘

将同一表的多个列上的条件计数相加

查询

更恐怖，但是很慢

Query

Index

Shorter, but slow

相关阅读

技术问答最新文章