且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

递归CTE在SQL Server中如何工作?

更新时间:2023-02-06 07:40:45

SQL Server中的递归CTE由两部分组成:



锚点:是递归的起点。此集合将通过递归联接进一步扩展。

 选择
EMPID,
FULLNAME,
MANAGERID,
1 AS ORGLEVEL

RECURSIVETBL
WHERE
MANAGERID为空

似乎正在获取所有没有经理的员工(将是高层领导者,还是来自树型关系的根源)。



递归:与 UNION ALL 链接时,该集合必须引用声明的CTE(因此使它递归)。认为它是如何扩展锚点的结果的下一个层次。

  UNION ALL 

选择
A.EMPID,
A.FULLNAME,
A.MANAGERID,
B. [ORGLEVEL] + 1

RECURSIVETBL A
JOIN RECURSIVECTE B –注意,我们引用的是 RECURSIVECTE,即CTE,我们在A.MANAGERID = B.EMPID
$ p中声明
$ p>

在此示例中,我们正在获取(第一次迭代)锚定结果集(所有没有经理的员工),并将它们与 RECURSIVETBL MANAGERID ,因此 A.EMPID 将容纳先前选择的经理的员工。只要每个最后一个结果集都可以生成新行,这种连接就会持续下去。



对递归部分的内容有一些限制(不分组)或其他嵌套的递归)。此外,由于它以 UNION ALL 开头,因此它的规则也适用(列的数量和数据类型必须匹配)。



关于 ORGLEVEL ,它以设置为1的锚点开始(在此处进行硬编码)。在递归集上进行进一步扩展时,它将获取上一个集(锚,在第一次迭代中),并添加1,因为它的表达式是 B. [ORGLEVEL] + 1 ,其中 B 是上一组。这意味着它以1(最高领导者)开头,并且为每个后代不断增加1,从而代表了组织的所有级别。



找到员工时在 ORGLEVEL = 3 的情况下,他在他上方有2位经理。






逐步介绍工作示例

让我们遵循以下示例:

  EmployeeID ManagerID 
1空
2 1
3 1
4 2
5 2
6 1
7 6
8 6
9空
10 3
11 3
12 10
13 9
14 9
15 13




  1. 锚点:员工没有管理员( ManagerID为NULL )。这将从公司的所有头号坏蛋开始。至关重要的是要注意,如果锚集为空,则整个递归CTE将为空,因为没有起点,也没有要加入的递归集。

      SELECT 
    EmployeeID = E.EmployeeID,
    ManagerID = NULL,-在WHERE过滤器中始终为空
    HierarchyLevel = 1,
    HierarchyRoute = CONVERT (VARCHAR(MAX),E.EmployeeID)
    来自
    员工为AS E
    WHERE
    E.ManagerID为NULL


其中有哪些:

  EmployeeID ManagerID HierarchyLevel HierarchyRoute 
1(空)1 1
9(空)1 9




  1. 递归N°1 :使用此 UNION ALL 递归:

      UNION ALL 

    SELECT
    EmployeeID = E.EmployeeID,
    ManagerID = E.ManagerID,
    HierarchyLevel = R.HierarchyLevel + 1,
    HierarchyRoute = R.HierarchyRoute +’-> '+ CONVERT(VARCHAR(10),E.EmployeeID)

    递归CTE AS R
    内加入员工E E ON R.EmployeeID = E.ManagerID


为此 INNER JOIN RecursiveCTE 有2行(锚集),员工ID为 1 9 。因此,此 JOIN 实际上将返回此结果。

  HierarchyLevel EmployeeID ManagerID HierarchyRoute 
2 2 1 1-> 2
2 3 1 1-> 3
2 6 1 1-> 6
2 13 9 9-> 13
2 14 9 9-> 14

查看 HierarchyRoute 如何从1开始和9并移动到每个后代?我们还将 HierarchyLevel 增加了1。



因为结果由 UNION ALL链接,此时我们得到以下结果(步骤1 + 2):

  HierarchyLevel EmployeeID ManagerID层次结构路由
1 1(空)1
1 9(空)9
2 2 1 1-> 2
2 3 1 1-> 3
2 6 1 1-> 6
2 13 9 9-> 13
2 14 9 9-> 14

这是棘手的部分,对于以下每个迭代,递归引用 RecursiveCTE 仅包含最后一个迭代结果集,而不包含累积的结果集。这意味着对于下一次迭代, RecursiveCTE 将代表以下行:

  HierarchyLevel EmployeeID ManagerID HierarchyRoute 
2 2 1 1-> 2
2 3 1 1-> 3
2 6 1 1-> 6
2 13 9 9-> 13
2 14 9 9-> 14




  1. 递归N °2 :遵循相同的递归表达式...

      UNION ALL 

    SELECT
    EmployeeID = E.EmployeeID,
    ManagerID = E.ManagerID,
    HierarchyLevel = R.HierarchyLevel + 1,
    HierarchyRoute = R.HierarchyRoute +'-> '+ CONVERT(VARCHAR(10),E.EmployeeID)

    递归CTE AS R
    内加入员工E E ON R.EmployeeID = E.ManagerID


并考虑到在此步骤中 RecursiveCTE 仅保存 HierarchyLevel = 2 的行,如果此JOIN为以下(第3级!),则返回结果:

  HierarchyLevel EmployeeID ManagerID HierarchyRoute 
3 4 2 1-> 2-> 4
3 5 2 1-> 2-> 5
3 7 6 1-> 6-> 7
3 8 6 1-> 6-> 8
3 10 3 1-> 3-> 10
3 11 3 1-> 3-> 11
3 15 13 9-> 13-> 15

此集合(仅此!)将在以下递归步骤中用作 RecursiveCTE ,它将被添加到累计总计中,现在是:

  HierarchyLevel EmployeeID ManagerID HierarchyRoute 
1 1(null)1
1 9(null)9
2 2 1 1-> 2
2 3 1 1-> 3
2 6 1 1-> 6
2 13 9 9-> 13
2 14 9 9-> 14
3 4 2 1-> 2-> 4
3 5 2 1-> 2-> 5
3 7 6 1-> 6-> 7
3 8 6 1-> 6-> 8
3 10 3 1-> 3-> 10
3 11 3 1-> 3-> 11
3 15 13 9-> 13-> 15




  1. 递归N °3 :从工作集中的3s级开始,加入的结果为:

      HierarchyLevel EmployeeID ManagerID层次路线
    4 12 10 1-> 3-> 10-> 12


这成为我们下一个递归的工作集


  1. 递归N°4 :从上一个唯一的行级别4开始步骤,联接的结果不产生任何行(没有雇员将EmployeeID 12作为ManagerID)。不返回任何行表示迭代结束。

最终结果集很高:

  HierarchyLevel EmployeeID ManagerID HierarchyRoute 
1 1(null)1
1 9(null)9
2 2 1 1-> 2
2 3 1 1-> 3
2 6 1 1-> 6
2 13 9 9-> 13
2 14 9 9-> 14
3 4 2 1-> 2-> 4
3 5 2 1-> 2-> 5
3 7 6 1-> 6-> 7
3 8 6 1-> 6-> 8
3 10 3 1-> 3-> 10
3 11 3 1-> 3-> 11
3 15 13 9-> 13-> 15
4 12 10 1-> 3-> 10-> 12

这是完整的小提琴和代码:

 创建表Employee(EmployeeID INT ,ManagerID INT)

插入员工(EmployeeID,ManagerID)

(1,NULL),
(2,1),
( 3,1),
(4,2),
(5,2),
(6,1),
(7,6),
( 8,6),
(9,NULL),
(10,3),
(11,3),
(12,10),
( 13,9),
(14,9),
(15,13)

RecursiveCTE AS

选择
EmployeeID = E.EmployeeID,
ManagerID = NULL,-通过WHERE过滤器始终为null
HierarchyLevel = 1,
HierarchyRoute = CONVERT(VARCHAR(MAX),E.EmployeeID)
FROM
员工AS E
WHERE
E.ManagerID为空

UNION ALL

SELECT
EmployeeID = E.EmployeeID,
ManagerID = E.Manag erID,
HierarchyLevel = R.HierarchyLevel + 1,
HierarchyRoute = R.HierarchyRoute +’-> '+ CONVERT(VARCHAR(10),E.EmployeeID)

递归CTE AS R
内加入员工E E ON R.EmployeeID = E.ManagerID

选择
R.HierarchyLevel,
R.EmployeeID,
R.ManagerID,
R.HierarchyRoute

递归CTE AS R
订购按
R.HierarchyLevel,
R.EmployeeID


Can anyone help me understand how this recursive CTE works?

WITH
RECURSIVECTE (EMPID, FULLNAME, MANAGERID, [ORGLEVEL]) AS
    (SELECT EMPID,
            FULLNAME,
            MANAGERID,
            1
     FROM RECURSIVETBL
     WHERE MANAGERID IS NULL
     UNION ALL
     SELECT A.EMPID,
            A.FULLNAME,
            A.MANAGERID,
            B.[ORGLEVEL] + 1
     FROM RECURSIVETBL A
          JOIN RECURSIVECTE B ON A.MANAGERID = B.EMPID)
SELECT *
FROM RECURSIVECTE;

Recursive CTEs in SQL Server have 2 parts:

The Anchor: Is the starting point of your recursion. It's a set that will be further expanded by recursive joins.

SELECT 
    EMPID,
    FULLNAME,
    MANAGERID,
    1 AS ORGLEVEL
FROM 
    RECURSIVETBL
WHERE 
    MANAGERID IS NULL

It seems that it's fetching all employees that don't have any managers (would be the top bosses, or roots from tree relationships).

The recursion: Linked with a UNION ALL, this set has to reference the declaring CTE (thus making it recursive). Think of it as how will you expand the result of the anchor with the next level.

UNION ALL

SELECT 
    A.EMPID,
    A.FULLNAME,
    A.MANAGERID,
    B.[ORGLEVEL] + 1
FROM 
    RECURSIVETBL A
    JOIN RECURSIVECTE B  -- Notice that we are referencing "RECURSIVECTE" which is the CTE we are declaring
    ON A.MANAGERID = B.EMPID

On this example, we are fetching (on a first iteration) the anchor result set (all employees with no managers) and joining them with RECURSIVETBL through MANAGERID, so A.EMPID will hold the employee of the previously selected manager. This joining goes on and on as long as each last result set can generate new rows.

There are a few limitations on what you can put on the recursive part (no grouping or another nested recursion for example). Also, as it's preceeded with a UNION ALL, it's rules also apply (amount of columns and data types must match).

About the ORGLEVEL, it starts with the anchor set at 1 (it's hard-coded there). When it's further expanded on the recursion set, it fetches the previous set (the anchor, on the first iteration) and it adds 1, as it's expression is B.[ORGLEVEL] + 1 with B being the previous set. This means that it starts with 1 (the top bosses) and it keeps adding 1 for each descendant, thus representing all the levels of the organization.

When you find an employee at ORGLEVEL = 3 means that he has 2 managers over him.


Step by step with working example

Let's follow this example:

EmployeeID  ManagerID
1           NULL
2           1
3           1
4           2
5           2
6           1
7           6
8           6
9           NULL
10          3
11          3
12          10
13          9
14          9
15          13

  1. Anchor: Employees without managers (ManagerID IS NULL). This will start with all the top badass of your company. It's crucial to note that if the anchor set is empty, then the whole recursive CTE will be empty, as there is no starting point and no recursive set to join to.

    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = NULL, -- Always null by WHERE filter
        HierarchyLevel = 1,
        HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
    FROM
        Employee AS E
    WHERE
        E.ManagerID IS NULL
    

Which are these:

EmployeeID  ManagerID   HierarchyLevel  HierarchyRoute
1           (null)      1               1
9           (null)      1               9

  1. Recursion N°1: Using this UNION ALL recursion:

    UNION ALL
    
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
    

For this INNER JOIN, RecursiveCTE has 2 rows (the anchor set), with employees ID 1 and 9. So this JOIN will actually return this result.

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

See how the HierarchyRoute starts with 1 and 9 and moves to each descendant? We also increased HierarchyLevel by 1.

Because the results are linked by a UNION ALL, at this point we have the following results (step 1 + 2):

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

Here is the tricky part, for each of the following iterations, recursive references to RecursiveCTE will only contain the last iteration result set, and not the accumulated set. This means that for the next iteration, RecursiveCTE will represent these rows:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

  1. Recursion N°2: Following the same recursive expression...

    UNION ALL
    
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
    

And considering that in this step RecursiveCTE only holds rows with HierarchyLevel = 2, then the result if this JOIN is the following (level 3!):

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15

This set (and only this!) will be used in the following recursive step as RecursiveCTE, and it will be added to the accumulated grand total, which is now:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15

  1. Recursion N°3: Starting with level 3s in our working set, the result of the join is:

    HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
    4               12          10          1 -> 3 -> 10 -> 12
    

This becomes our working set for next recursive step.

  1. Recursion N°4: Starting with the only row level 4 from previous step, the result of the join yields no rows (no employee has EmployeeID 12 as ManagerID). Returning no rows marks the end of the iterations.

The final result set stands tall:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15
4               12          10          1 -> 3 -> 10 -> 12

Here is the complete fiddle and code:

CREATE TABLE Employee (EmployeeID INT, ManagerID INT)

INSERT INTO Employee (EmployeeID, ManagerID)
VALUES
  (1, NULL),
  (2, 1),
  (3, 1),
  (4, 2),
  (5, 2),
  (6, 1),
  (7, 6),
  (8, 6),
  (9, NULL),
  (10, 3),
  (11, 3),
  (12, 10),
  (13, 9),
  (14, 9),
  (15, 13)

WITH RecursiveCTE AS
(
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = NULL, -- Always null by WHERE filter
        HierarchyLevel = 1,
        HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
    FROM
        Employee AS E
    WHERE
        E.ManagerID IS NULL

    UNION ALL

    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
)
SELECT
    R.HierarchyLevel,
    R.EmployeeID,
    R.ManagerID,
    R.HierarchyRoute
FROM
    RecursiveCTE AS R
ORDER BY
    R.HierarchyLevel,
    R.EmployeeID