更新时间:2022-12-12 14:46:31
查询并不像初看起来那样简单.最短的查询字符串不一定会产生***性能.这应该尽快完成,为此应尽可能短:
SELECT p.username, COALESCE(w.ct, 0) AS won, COALESCE(l.ct, 0) AS lost
FROM (
SELECT loser_id AS player_id, count(*) AS ct
FROM match
WHERE winner_id = 1 -- your player_id here
GROUP BY 1 -- positional reference (not your player_id)
) w
FULL JOIN (
SELECT winner_id AS player_id, count(*) AS ct
FROM match
WHERE loser_id = 1 -- your player_id here
GROUP BY 1
) l USING (player_id)
JOIN player p USING (player_id)
ORDER BY 1;
结果完全符合要求:
username | won | lost
---------+-----+-----
alice | 3 | 2
bob | 1 | 0
mary | 2 | 1
SQL提琴 -具有更多可揭示的测试数据!
关键功能是 FULL [OUTER] JOIN
两个子查询之间的"strong> "(输赢).这将产生一个表格,其中列出了我们的候选人所针对的所有玩家.连接条件中的USING
子句可以方便地将两个player_id
列合并为一个.
此后,用一个JOIN
到player
来获取名称,然后仅索引扫描)出于这一点.这样,Postgres甚至根本不会访问match
表 ,您就会获得超快的结果.
在两列integer
列中,您碰巧遇到了局部最优值:这些索引的大小与简单索引的大小相同.详细信息:
您可以运行相关的子查询,例如 @Giorgi建议,只需正确地运行 :>
SELECT *
FROM (
SELECT username
, (SELECT count(*) FROM match
WHERE loser_id = p.player_id
AND winner_id = 1) AS won
, (SELECT count(*) FROM match
WHERE winner_id = p.player_id
AND loser_id = 1) AS lost
FROM player p
WHERE player_id <> 1
) sub
WHERE (won > 0 OR lost > 0)
ORDER BY username;
适用于小型表,但不能扩展.这需要每个现有播放器在player
上进行顺序扫描,并在match
上进行两次索引扫描.用EXPLAIN ANALYZE
比较性能.
I am looking for a "better" way to perform a query in which I want to show a single player who he has played previously and the associated win-loss record for each such opponent.
Here are the tables involved stripped down to essentials:
create table player (player_id int, username text);
create table match (winner_id int, loser_id int);
insert into player values (1, 'john'), (2, 'mary'), (3, 'bob'), (4, 'alice');
insert into match values (1, 2), (1, 2), (1, 3), (1, 4), (1, 4), (1, 4)
, (2, 1), (4, 1), (4, 1);
Thus, john has a record of 2 wins and 1 loss vs mary; 1 win and 0 losses vs bob; and 3 wins and 2 losses vs alice.
create index idx_winners on match(winner_id);
create index idx_winners on match(loser_id);
I am using Postgres 9.4. Something in the back of my head tells me to consider LATERAL
somehow but I'm having a hard time understanding the "shape" of such.
The following is the query I am using currently but something "feels off". Please help me learn and improve this.
select p.username as opponent,
coalesce(r.won, 0) as won,
coalesce(r.lost, 0) as lost
from (
select m.winner_id, m.loser_id, count(m.*) as won, (
select t.lost
from (
select winner_id, loser_id, count(*) as lost
from match
where loser_id = m.winner_id
and winner_id = m.loser_id
group by winner_id, loser_id
) t
)
from match m
where m.winner_id = 1 -- this would be a parameter
group by m.winner_id, m.loser_id
) r
join player p on p.player_id = r.loser_id;
This works as expected. Just looking to learn some tricks or better yet proper techniques to do the same.
opponent won lost
-------- --- ----
alice 3 2
bob 1 0
mary 2 1
The query is not as simple as it looks at first. The shortest query string does not necessarily yield best performance. This should be as fast as it gets, being as short as possible for that:
SELECT p.username, COALESCE(w.ct, 0) AS won, COALESCE(l.ct, 0) AS lost
FROM (
SELECT loser_id AS player_id, count(*) AS ct
FROM match
WHERE winner_id = 1 -- your player_id here
GROUP BY 1 -- positional reference (not your player_id)
) w
FULL JOIN (
SELECT winner_id AS player_id, count(*) AS ct
FROM match
WHERE loser_id = 1 -- your player_id here
GROUP BY 1
) l USING (player_id)
JOIN player p USING (player_id)
ORDER BY 1;
Result exactly as requested:
username | won | lost
---------+-----+-----
alice | 3 | 2
bob | 1 | 0
mary | 2 | 1
SQL Fiddle - with more revealing test data!
The key feature is the FULL [OUTER] JOIN
between the two subqueries for losses and wins. This produces a table of all players our candidate has played against. The USING
clause in the join condition conveniently merges the two player_id
columns into one.
After that, a single JOIN
to player
to get the name, and COALESCE
to replace NULL with 0. Voilá.
Would be even faster with two multicolumn indexes:
CREATE INDEX idx_winner on match (winner_id, loser_id);
CREATE INDEX idx_loser on match (loser_id, winner_id);
Only if you get index-only scans out of this. Then Postgres does not even visit the match
table at all and you get super-fast results.
With two integer
columns you happen to hit a local optimum: theses indexes have just the same size as the simple ones you had. Details:
You could run correlated subqueries like @Giorgi suggested, just working correctly:
SELECT *
FROM (
SELECT username
, (SELECT count(*) FROM match
WHERE loser_id = p.player_id
AND winner_id = 1) AS won
, (SELECT count(*) FROM match
WHERE winner_id = p.player_id
AND loser_id = 1) AS lost
FROM player p
WHERE player_id <> 1
) sub
WHERE (won > 0 OR lost > 0)
ORDER BY username;
Works fine for small tables, but doesn't scale. This needs a sequential scan on player
and two index scans on match
per existing player. Compare performance with EXPLAIN ANALYZE
.