Think of it as a staged set of queries. The first query must not be self-referential, since it establishes the base result set used for recursion. The second query then is free to use the results of the first part of the union recursively. The second query then sends its result to the second part recursively (excluding the results it was sent). The recursion ends when there are no more results from the query - in this case when you hit the bottom of the tree where an employee is not a manager of anyone below them.
Taking a simple set of data, we might have something like:
INSERT INTO Employees(empid, empname, mgrid) VALUES(7, "Chris", NULL)
INSERT INTO Employees(empid, empname, mgrid) VALUES(8, "Hank", 7)
INSERT INTO Employees(empid, empname, mgrid) VALUES(9, "Fred", 8)
The Anchor in the above query returns a single result:
(empid = 7, empname = "Chris", mgrid = NULL, lvl = 0)
Then this is fed into the join of the second query, returning another row:
(empid = 8, empname = "Hank", mgrid = 7, lvl = 1)
Which is fed into the recursion and then returns:
(empid = 9, empname = "Fred", mgrid = 8, lvl = 2)
Which then feeds into the final recursion where an empty set is returned:
(no results)
The final result set returned by the query would be the union of all the results:
(empid = 7, empname = "Chris", mgrid = NULL, lvl = 0)
(empid = 8, empname = "Hank", mgrid = 7, lvl = 1)
(empid = 9, empname = "Fred", mgrid = 8, lvl = 2)
As I said, it's not very consistent with any other SQL syntax that exists in T-SQL or PL/SQL.
|