且构网


Finding duplicate column values that end with the same last four digits - SQL

Updated: 2023-02-10 18:53:45

I just want to extend Sick's answer.

You said you would like to choose which row to eliminate. You can also add a CASE expression to the ORDER BY clause of row_number() to control which row in each group is kept and which ones are eliminated.

In this case I order by "name", so you can delete every row with rn > 1 and keep the first name in each group.

SqlFiddleDemo


select "person_number", "name", rn, zero_count
from
(
  select "person_number",
         "name",
         substr("person_number", 1, 1) as first_char,
         -- number of rows sharing the same last four characters
         count(1) over (partition by substr("person_number", -4)) as cnt,
         -- number of rows in the group whose value starts with '0'
         sum(case
               when substr("person_number", 1, 1) = '0' then 1
               else 0
             end) over (partition by substr("person_number", -4)) as zero_count,
         -- position inside the group, ordered by name (rn = 1 is the row to keep)
         row_number() over (partition by substr("person_number", -4) order by "name") as rn
  from person
)
where cnt > 1
  and zero_count > 0
order by substr("person_number", -4)
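The CASE-in-ORDER-BY idea mentioned above can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer, for window functions) instead of the database from the fiddle, the three sample rows are made up, and the rule "rows starting with 0 are the ones to eliminate" is just one possible choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('create table person ("person_number" text, "name" text)')
# Hypothetical sample rows, not taken from the original fiddle.
conn.executemany(
    "insert into person values (?, ?)",
    [
        ("0001234", "Alice"),  # starts with 0, shares the suffix 1234
        ("9991234", "Bob"),    # shares the suffix 1234
        ("5556789", "Carol"),  # unique suffix
    ],
)

# Rank rows inside each suffix group. The CASE expression sorts rows that
# start with '0' last, so rn = 1 is the row we keep; "name" breaks ties.
ranked = """
select rowid as rid,
       row_number() over (
           partition by substr("person_number", -4)
           order by case when substr("person_number", 1, 1) = '0' then 1 else 0 end,
                    "name"
       ) as rn
from person
"""

# Delete everything except the first-ranked row of each group.
conn.execute(
    f"delete from person where rowid in (select rid from ({ranked}) where rn > 1)"
)

remaining = [row[0] for row in conn.execute('select "name" from person order by "name"')]
print(remaining)  # ['Bob', 'Carol']
```

Ordering by "name" alone would have kept Alice; the CASE expression is what makes the row starting with 0 the one that gets removed.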

I extended the sample data:


  • The query now includes a zero_count field that counts how many rows in each group start with 0.
  • A case where both rows share the same last 4 characters and both start with 0 (zero_count = 2).
  • A case where a row without a match also starts with 0.
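To see how the three cases listed above behave, here is a small sketch that runs the query against made-up rows. The data is hypothetical (the fiddle's actual rows are not reproduced here), and SQLite via Python's sqlite3 module (SQLite 3.25 or newer) stands in for the original database; rn is added to the final ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('create table person ("person_number" text, "name" text)')
# Hypothetical rows covering the cases described in the list above.
conn.executemany(
    "insert into person values (?, ?)",
    [
        ("0001111", "Ann"),  # same suffix as Ben, both start with 0
        ("0221111", "Ben"),
        ("0333333", "Cid"),  # starts with 0 but has no matching suffix
        ("1234444", "Dee"),  # same suffix as Eve, only Eve starts with 0
        ("0564444", "Eve"),
        ("7775555", "Fay"),  # same suffix as Gus, neither starts with 0
        ("8885555", "Gus"),
    ],
)

query = """
select "person_number", "name", rn, zero_count
from (
  select "person_number",
         "name",
         count(1) over (partition by substr("person_number", -4)) as cnt,
         sum(case when substr("person_number", 1, 1) = '0' then 1 else 0 end)
             over (partition by substr("person_number", -4)) as zero_count,
         row_number() over (partition by substr("person_number", -4)
                            order by "name") as rn
  from person
)
where cnt > 1
  and zero_count > 0
order by substr("person_number", -4), rn
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('0001111', 'Ann', 1, 2)
# ('0221111', 'Ben', 2, 2)
# ('1234444', 'Dee', 1, 1)
# ('0564444', 'Eve', 2, 1)
```

Cid is filtered out by cnt > 1 because no other row shares its suffix, and the Fay/Gus pair is filtered out by zero_count > 0 because neither value starts with 0.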