且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

MySQL - 控制组返回哪一行

更新时间:2023-02-13 20:18:49

这叫做选择列的分组最大值.这里有几种不同的 mysql 方法.

It's called selecting the group-wise maximum of a column. Here are several different approaches for mysql.

这是我的做法:

SELECT *
FROM (SELECT id, max(version_id) as version_id FROM table GROUP BY id) t1
INNER JOIN table t2 on t2.id=t1.id and t1.version_id=t2.version_id

虽然mysql会在内存中为子查询创建一个临时表,但这样会比较有效.我假设您已经有了这个表的 (id, version_id) 索引.

This will be relatively efficient, though mysql will create a temporary table in memory for the subquery. I assume you already have an index on (id, version_id) for this table.

对于此类问题,您或多或少必须使用子查询是 SQL 的一个缺陷(半连接是另一个例子).

It's a deficiency in SQL that you more or less have to use a subquery for this type of problem (semi-joins are another example).

子查询在 mysql 中没有得到很好的优化,但不相关的子查询并没有那么糟糕,只要它们不是太大以至于它们会被写入磁盘而不是内存.鉴于在此查询中只有两个整数,子查询可能在此之前很久就有数百万行,但您的第一个查询中的 select * 子查询可能会更快地遇到此问题.

Subqueries are not well optimized in mysql but uncorrelated subqueries aren't so bad as long as they aren't so enormous that they will get written to disk rather than memory. Given that in this query only has two ints the subquery could be millions of rows long before that happened but the select * subquery in your first query could suffer from this problem much sooner.