更新时间:2022-10-17 22:49:41
由于您的范围是连续的,因此问题本质上变成了 间隙和岛屿 一.如果您有一个标准来帮助您区分具有相同 t
值的不同序列,您可以使用该标准对所有行进行分组,然后只需取 MIN(s), MAX(e)
每个组.
获得此类标准的一种方法是使用两个 ROW_NUMBER
调用.考虑以下查询:
SELECT*,rnk1 = ROW_NUMBER() OVER(ORDER BY s),rnk2 = ROW_NUMBER() OVER(PARTITION BY t ORDER BY s)发件人@句号;
对于您的示例,它将返回以下集合:
s e t rnk1 rnk2---------- ---------- -- -- ---- ----2013-01-01 2013-01-02 3 1 12013-01-02 2013-01-04 1 2 12013-01-04 2013-01-05 1 3 22013-01-05 2013-01-06 2 4 12013-01-06 2013-01-07 2 5 22013-01-07 2013-01-08 2 6 32013-01-08 2013-01-09 1 7 3
rnk1
和 rnk2
排名的有趣之处在于,如果您从另一个中减去一个,您将获得与 t
,唯一标识每个具有相同t
的不同行序列:
s e t rnk1 rnk2 rnk1 - rnk2---------- ---------- -- ---- ---- ------------2013-01-01 2013-01-02 3 1 1 02013-01-02 2013-01-04 1 2 1 12013-01-04 2013-01-05 1 3 2 12013-01-05 2013-01-06 2 4 1 32013-01-06 2013-01-07 2 5 2 32013-01-07 2013-01-08 2 6 3 32013-01-08 2013-01-09 1 7 3 4
知道了这一点,您可以轻松地应用分组和聚合.这是最终查询的样子:
WITH 分区 AS (选择*,g = ROW_NUMBER() OVER ( ORDER BY s)- ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)发件人@句号)选择s = MIN(s),e = MAX(e),吨从分区通过...分组吨,G;
如果您愿意,可以在 在 SQL Fiddle 使用此解决方案.>
Say we have such a table:
declare @periods table (
s date,
e date,
t tinyint
);
with date intervals without gaps ordered by start date (s)
insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
All date intervals have different types (t).
It is required to combine date intervals of the same type where they are not broken by intervals of the other types (having all intervals ordered by start date).
So the result table should look like:
s | e | t
------------|------------|-----
2013-01-01 | 2013-01-02 | 3
2013-01-02 | 2013-01-05 | 1
2013-01-05 | 2013-01-08 | 2
2013-01-08 | 2013-01-09 | 1
Any ideas how to do this without cursor?
I've got one working solution:
declare @periods table (
s datetime primary key clustered,
e datetime,
t tinyint,
period_number int
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
declare @anchor date;
update @periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t,
@anchor = s
option (maxdop 1);
select
s = min(s),
e = max(e),
t = min(t)
from
@periods
group by
period_number
order by
s;
but I doubt if I can rely on such a behavior of UPDATE statement?
I use SQL Server 2008 R2.
Edit:
Thanks to Daniel and this article: http://www.sqlservercentral.com/articles/T-SQL/68467/
I found three important things that were missed in the solution above:
I've changed the above solution in accordance with these rules.
Since your ranges are continuous, the problem essentially becomes a gaps-and-islands one. If only you had a criterion to help you to distinguish between different sequences with the same t
value, you could group all the rows using that criterion, then just take MIN(s), MAX(e)
for every group.
One method of obtaining such a criterion is to use two ROW_NUMBER
calls. Consider the following query:
SELECT
*,
rnk1 = ROW_NUMBER() OVER ( ORDER BY s),
rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
;
For your example it would return the following set:
s e t rnk1 rnk2
---------- ---------- -- ---- ----
2013-01-01 2013-01-02 3 1 1
2013-01-02 2013-01-04 1 2 1
2013-01-04 2013-01-05 1 3 2
2013-01-05 2013-01-06 2 4 1
2013-01-06 2013-01-07 2 5 2
2013-01-07 2013-01-08 2 6 3
2013-01-08 2013-01-09 1 7 3
The interesting thing about the rnk1
and rnk2
rankings is that if you subtract one from the other, you will get values that, together with t
, uniquely identify every distinct sequence of rows with the same t
:
s e t rnk1 rnk2 rnk1 - rnk2
---------- ---------- -- ---- ---- -----------
2013-01-01 2013-01-02 3 1 1 0
2013-01-02 2013-01-04 1 2 1 1
2013-01-04 2013-01-05 1 3 2 1
2013-01-05 2013-01-06 2 4 1 3
2013-01-06 2013-01-07 2 5 2 3
2013-01-07 2013-01-08 2 6 3 3
2013-01-08 2013-01-09 1 7 3 4
Knowing that, you can easily apply grouping and aggregation. This is what the final query might look like:
WITH partitioned AS (
SELECT
*,
g = ROW_NUMBER() OVER ( ORDER BY s)
- ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
)
SELECT
s = MIN(s),
e = MAX(e),
t
FROM partitioned
GROUP BY
t,
g
;
If you like, you can play with this solution at SQL Fiddle.