且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

仅在SQL的列中选择重复值的第一行

更新时间:2022-12-09 23:34:55

您可以使用EXISTS半联接来识别候选者:

You can use a EXISTS semi-join to identify candidates:

SELECT * FROM tbl
WHERE NOT EXISTS (
    SELECT *
    FROM tbl t
    WHERE t.col1 = tbl.col1
    AND t.id = tbl.id - 1
    )
ORDER BY id

摆脱不必要的行:

DELETE FROM tbl
-- SELECT * FROM tbl
WHERE EXISTS (
    SELECT *
    FROM   tbl t
    WHERE  t.col1 = tbl.col1
    AND    t.id   = tbl.id - 1
    )

这将有效地删除每一行,而前一行在col1中具有相同的值,从而达到您设定的目标:每个突发的第一行都将保留.

This effectively deletes every row, where the preceding row has the same value in col1, thereby arriving at your set goal: only the first row of every burst survives.

我留下了注释后的SELECT声明,因为您应该始终在执行操作之前先检查要删除的内容.

I left the commented SELECT statement because you should always check what is going to be deleted before you do the deed.

如果您的RDBMS支持 CTE

If your RDBMS supports the CTE and window functions (like PostgreSQL, Oracle, SQL Server, ... but not SQLite, MS Access or MySQL), there is an elegant way:

WITH x AS (
    SELECT *, row_number() OVER (ORDER BY id) AS rn
    FROM tbl
    )
SELECT id, col1
FROM   x
WHERE NOT EXISTS (
    SELECT *
    FROM   x x1
    WHERE  x1.col1 = x.col1
    AND    x1.rn   = x.rn - 1
    )
ORDER BY id;

还有一种不太优雅的方式来完成这项工作没有这些细微之处.
应该为您工作:

There is also the not-so-elegant way that does the job without those niceties.
Should work for you:

SELECT id, col1
FROM   tbl
WHERE (
    SELECT t.col1 = tbl.col1
    FROM   tbl AS t
    WHERE  t.id < tbl.id
    ORDER  BY id DESC
    LIMIT  1) IS NOT TRUE
ORDER BY id


测试外壳非顺序ID的工具

(在PostgreSQL中测试)


Tool for test-casing non-sequential IDs

(Tested in PostgreSQL)

CREATE TEMP TABLE tbl (id int, col1 int);
INSERT INTO tbl VALUES
 (1,6050000),(2,6050000),(6,6050000)
,(14,6060000),(15,6060000),(16,6060000)
,(17,6060000),(18,6060000),(19,6050000)
,(20,6000000),(111,6000000);