Lambda the Ultimate

inactiveTopic SQL Server "Yukon" Beta 1 Transact-SQL Enhancements
started 11/7/2003; 1:38:51 PM - last post 11/10/2003; 1:23:11 PM
Chris Rathman - SQL Server "Yukon" Beta 1 Transact-SQL Enhancements  blueArrow
11/7/2003; 1:38:51 PM (reads: 7656, responses: 6)
SQL Server "Yukon" Beta 1 Transact-SQL Enhancements
Isaac's post about the MS PDC got me to reading about the proposed T-SQL Enhancements in SQL Server "Yukon". One change that caught my eye was the Common Table Expression (CTE) and the ability to define recursive queries. Recursive queries could come in handy when the data structure is in the form of a hierarchical Tree. The syntax for a CTE seems to be a lifted LET sort of construction:

WITH EmpCTE(empid, empname, mgrid, lvl) AS (
   -- Anchor Member (AM)
   SELECT empid, empname, mgrid, 0
   FROM Employees
   WHERE empid = 7

   UNION ALL

   -- Recursive Member (RM)
   SELECT E.empid, E.empname, E.mgrid, M.lvl+1
   FROM Employees AS E
   JOIN EmpCTE AS M
   ON E.mgrid = M.empid
)
SELECT * FROM EmpCTE

I guess the usual rules when any vendor enhances SQL applies: (1). Never make it similar to the enhancements made by the competition (I'm thinking of CONNECT BY in Oracle SQL); and (2). Never make the change consistent with your own syntax constructs (Why didn't they just make Views capable of recursion and then introduce CREATE @var VIEW into the T-SQL construction similar to what they did with CREATE @var TABLE?). That aside, the ability to do Tree operations is something that I can use in SQL. It will be interesting to see how the performance stacks up against Nested Sets.
Posted to general by Chris Rathman on 11/7/03; 1:53:40 PM

Ehud Lamm - Re: SQL Server  blueArrow
11/10/2003; 3:53:09 AM (reads: 541, responses: 0)
I admit I am not sure how is this supposed to work. Can you explain a bit?

Chris Rathman - Re: SQL Server  blueArrow
11/10/2003; 7:41:19 AM (reads: 526, responses: 1)
Think of it as a staged set of queries. The first query must not be self-referential, since it establishes the base result set used for recursion. The second query then is free to use the results of the first part of the union recursively. The second query then sends its result to the second part recursively (excluding the results it was sent). The recursion ends when there are no more results from the query - in this case when you hit the bottom of the tree where an employee is not a manager of anyone below them.

Taking a simple set of data, we might have something like:

INSERT INTO Employees(empid, empname, mgrid) VALUES(7, "Chris", NULL)
INSERT INTO Employees(empid, empname, mgrid) VALUES(8, "Hank", 7)
INSERT INTO Employees(empid, empname, mgrid) VALUES(9, "Fred", 8)

The Anchor in the above query returns a single result:

(empid = 7, empname = "Chris", mgrid = NULL, lvl = 0)

Then this is fed into the join of the second query, returning another row:

(empid = 8, empname = "Hank", mgrid = 7, lvl = 1)

Which is fed into the recursion and then returns:

(empid = 9, empname = "Fred", mgrid = 8, lvl = 2)

Which then feeds into the final recursion where an empty set is returned:

(no results)

The final result set returned by the query would be the union of all the results:

(empid = 7, empname = "Chris", mgrid = NULL, lvl = 0)
(empid = 8, empname = "Hank", mgrid = 7, lvl = 1)
(empid = 9, empname = "Fred", mgrid = 8, lvl = 2)

As I said, it's not very consistent with any other SQL syntax that exists in T-SQL or PL/SQL.

Ehud Lamm - Re: SQL Server  blueArrow
11/10/2003; 8:40:03 AM (reads: 533, responses: 0)
And the db engine executes these queries repeatedly (until the fixed point), without doing anything smart to them or with them?

Chris Rathman - Re: SQL Server  blueArrow
11/10/2003; 8:51:34 AM (reads: 520, responses: 1)
That's kind of why I'm interested in how the performance will hold up - as compared to a Nested Set implementation. Nested Sets are a pain on the INSERT, UPDATE, and DELETE operations, but they are fast on the SELECT operations because all the operation on the Tree can be handled as a normal set type operation.

Theoretically, the recursion shouldn't be much trickier than the PL/SQL CONNECT BY syntax. Problem is, though, that there are some other variations possible in the recursive CTE's arrangement. So certain assumptions in the connect by syntax may not hold in the CTE syntax.

That said, the links given for the new syntax really don't go into how MS implements the syntax nor what sort of optimizations they plan on using.

Ehud Lamm - Re: SQL Server  blueArrow
11/10/2003; 8:55:03 AM (reads: 533, responses: 0)
Thanks for the explanation. If you learn more, let us know.

Chris Rathman - Re: SQL Server  blueArrow
11/10/2003; 1:23:11 PM (reads: 507, responses: 0)
After a bit more research, I found that the recursive CTE's originated with DB2 and was part of the proposed SQL3.