且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

PostgreSQL GROUP BY 与 MySQL 不同?

更新时间:2023-11-26 22:14:46

MySQL 完全不符合标准的 GROUP BY 可以被 Postgres 的 DISTINCT ON 模拟.考虑一下:

MySQL:

SELECT a,b,c,d,e FROM table GROUP BY a

这会为 a 的每个值提供 1 行(您不知道是哪一个).实际上你可以猜到,因为 MySQL 不知道哈希聚合,所以它可能会使用排序......但它只会对 a 进行排序,所以行的顺序可能是随机的.除非它使用多列索引而不是排序.好吧,无论如何,它不是由查询指定的.

Postgres:

SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c

这会为 a 的每个值提供 1 行,该行将是根据查询指定的 ORDER BY 排序的第一行.简单.

请注意,这里不是我正在计算的聚合.所以 GROUP BY 实际上没有意义.DISTINCT ON 更有意义.

Rails 与 MySQL 结合,所以我对它生成的 SQL 在 Postgres 中不起作用并不感到惊讶.

I've been migrating some of my MySQL queries to PostgreSQL to use Heroku. Most of my queries work fine, but I keep having a similar recurring error when I use group by:

ERROR: column "XYZ" must appear in the GROUP BY clause or be used in an aggregate function

Could someone tell me what I'm doing wrong?


MySQL which works 100%:

SELECT `availables`.*
FROM `availables`
INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id
WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
GROUP BY availables.bookdate
ORDER BY availables.updated_at


PostgreSQL error:

ActiveRecord::StatementInvalid: PGError: ERROR: column "availables.id" must appear in the GROUP BY clause or be used in an aggregate function:
SELECT "availables".* FROM "availables" INNER JOIN "rooms" ON "rooms".id = "availables".room_id WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN E'2009-10-21' AND E'2009-10-23') GROUP BY availables.bookdate ORDER BY availables.updated_at


Ruby code generating the SQL:

expiration = Available.find(:all,
    :joins => [ :room ],
    :conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
    :group => 'availables.bookdate',
    :order => 'availables.updated_at')  


Expected Output (from working MySQL query):

+-----+-------+-------+------------+---------+---------------+---------------+
| id  | price | spots | bookdate   | room_id | created_at    | updated_at    |
+-----+-------+-------+------------+---------+---------------+---------------+
| 414 | 38.0  | 1     | 2009-11-22 | 1762    | 2009-11-20... | 2009-11-20... |
| 415 | 38.0  | 1     | 2009-11-23 | 1762    | 2009-11-20... | 2009-11-20... |
| 416 | 38.0  | 2     | 2009-11-24 | 1762    | 2009-11-20... | 2009-11-20... |
+-----+-------+-------+------------+---------+---------------+---------------+
3 rows in set

MySQL's totally non standards compliant GROUP BY can be emulated by Postgres' DISTINCT ON. Consider this:

MySQL:

SELECT a,b,c,d,e FROM table GROUP BY a

This delivers 1 row per value of a (which one, you don't really know). Well actually you can guess, because MySQL doesn't know about hash aggregates, so it will probably use a sort... but it will only sort on a, so the order of the rows could be random. Unless it uses a multicolumn index instead of sorting. Well, anyway, it's not specified by the query.

Postgres:

SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c

This delivers 1 row per value of a, this row will be the first one in the sort according to the ORDER BY specified by the query. Simple.

Note that here, it's not an aggregate I'm computing. So GROUP BY actually makes no sense. DISTINCT ON makes a lot more sense.

Rails is married to MySQL, so I'm not surprised that it generates SQL that doesn't work in Postgres.