I have a table with a parent/child relationship along with a date created column referencing itself. I want to display each parent record and all the descendants ordered by the most recent 'activity' on the node. So if row number 1 which was created a long time ago has a new child added to it (or a new child added to one its children for example), then I want it at the top of the results.
I'm having trouble getting this to work currently.
My table structure is as follows:
CREATE TABLE [dbo].[Orders](
[OrderId] [int] NOT NULL,
[Orders_OrderId] [int] NULL,
[DateOrdered] datetime)
I have written the following SQL to pull the information out:
WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance,
row_number() over (order by DateOrdered desc) as RN1
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1,
c.RN1
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
)
SELECT * from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
Is there any way I can 'aggregate' the results of the recursive selects, so that I could select the maximum date across an entire 'parent' node?
SQLFiddle demonstration: http://sqlfiddle.com/#!3/ca6cb/11 (record number 1 should be first as it has a child that was updated recently)
Update Thanks to the suggestion from @twrowsell I have the following query which does work but seems really clunky and has some performance issues, I feel I shouldn't have to have 3 CTE's to achieve this. Is there any way it can be condensed whilst preserving the 'row numbers' (as this is for a user display with paging)?
WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, 0 as distance, po.DateOrdered, po.orderid as [rootId]
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, c.distance + 1, b2.DateOrdered, c.[rootId]
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
),
mostRecentOrders as (
SELECT *,
MAX(DateOrdered) OVER (PARTITION BY rootId) as [HighestOrderId]
from allOrders
),
pagedOrders as (
select *, dense_rank() over (order by [HighestOrderId] desc) as [PagedRowNumber] from mostRecentOrders)
SELECT * from pagedOrders
where PagedRowNumber between 0 and 2
order by [HighestOrderId] desc
Also, I could use MAX(orderid)
as orderid is the ident and datecreated can't be updated in my scenario after its created.
Updated SQLFiddle: http://sqlfiddle.com/#!3/ca6cb/41
First, you need to store the "root" order ID so that you can differentiate between different "trees" of orders. Once you have that, you can aggregate and sort your data.
As far as I can tell, you will need at least one CTE to build the tree and a second to do the rankings since you can't use DENSE_RANK()
in a WHERE
clause.
The following query uses a temp table to store the tree. The query selects from the tree twice, once for the rows and a second time for the rankings. If I used a CTE for storing the tree, it would have to build it twice, since the CTE is basically just a reusable subquery (it would be rebuilt for each distinct time it is used). Using a temp table ensures that I only need to build it once.
Here is the SQL:
DECLARE @Offset INT = 0;
DECLARE @Fetch INT = 2;
-- Create the Order Trees
WITH OrderTree AS (
SELECT po.orderid AS RootOrderID,
po.orderid,
po.Orders_OrderId,
po.DateOrdered,
0 AS distance
FROM orders po WHERE po.Orders_OrderId IS NULL
UNION/**/ALL
SELECT parent.RootOrderID,
child.orderid,
child.Orders_OrderId,
child.DateOrdered,
parent.distance + 1 AS distance
FROM orders child
INNER JOIN OrderTree parent
ON child.Orders_OrderId = parent.orderid
)
SELECT *
INTO #OrderTree
FROM OrderTree;
-- Rank the order trees by MAX(DateOrdered)
WITH
Rankings AS (
SELECT RootOrderID,
MAX(DateOrdered) AS MaxDate,
ROW_NUMBER() OVER(ORDER BY MAX(DateOrdered) DESC, RootOrderID ASC) AS Rank
FROM #OrderTree
GROUP BY RootOrderID
)
-- Get the next @Fetch trees, starting at rank @Offset+1
SELECT TREE.*,
R.MaxDate,
R.Rank
FROM Rankings R
INNER JOIN #OrderTree TREE
ON R.RootOrderID = TREE.RootOrderID
WHERE R.Rank BETWEEN @Offset+1 AND (@Fetch+@Offset)
ORDER BY R.Rank ASC, TREE.distance ASC;
SQLFiddle
Note: The /**/
between UNION
and ALL
is a workaround for this issue.
I built my own "orders" table using data in an existing table I had in my database and did some bench-marking against the 3-CTE query in the question. This slightly outperforms it on a large pool of data (117 trees with a total of 37215 orders, max depth of 11). I benchmarked by running each query with STATISTICS IO
and STATISTICS TIME
turned on, clearing the cache and buffers before each run.
Below are the results of both queries, plus the results of the recursive CTE shared by both:
╔════════════╦══════════╦════════════╦══════════════╗
║ Query ║ CPU Time ║ Scan Count ║ Logical Reads║
╠════════════╬══════════╬════════════╬══════════════╣
║ Tree CTE ║ 24211ms ║ 4 ║ 1116243 ║
╟────────────╫──────────╫────────────╫──────────────╢
║ 3-CTE ║ 24789ms ║ 7 ║ 1192221 ║
║ Temp Table ║ 24384ms ║ 6 ║ 1116549 ║
╚════════════╩══════════╩════════════╩══════════════╝
It appears that the bulk of both of these queries is the recursive order-tree CTE. Removing the shared cost of the recursive CTE gives the following results:
╔════════════╦══════════╦════════════╦══════════════╗
║ Query ║ CPU Time ║ Scan Count ║ Logical Reads║
╠════════════╬══════════╬════════════╬══════════════╣
║ 3-CTE ║ 578ms ║ 3 ║ 75978 ║
║ Temp Table ║ 173ms ║ 2 ║ 306 ║
╚════════════╩══════════╩════════════╩══════════════╝
Based on these results, I highly recommend you add a RootOrderID column to your orders table to avoid having to use a potentially very costly recursive CTE.
Would using MAX on DateOrdered in an OVER clause in the outer select work..
WITH allOrders AS
(
SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance,
row_number() over (order by DateOrdered desc) as RN1
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1,
c.RN1
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
)
SELECT *, MAX(DateOrdered) OVER (PARTITION BY Orders_OrderId) from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
EDIT: Sorry I misinterpreted your requirement first time. It looks like you want to partition your results by the RN1 field not Orders_OrderId so your outer select will be something like..
SELECT MAX(DateOrdered) OVER (PARTITION BY RN1 ),* from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
Have a look at the following:
;WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance, po.orderid as [parentOrder]
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1, c.[parentOrder]
FROM orders b2
INNER JOIN allOrders c ON b2.Orders_OrderId = c.orderid
)
SELECT a.OrderId
,a.Orders_OrderId
,a.DateOrdered
,top1.DateOrdered as HIghestDate
,a.distance
,a.parentOrder
FROM allOrders a
INNER JOIN (SELECT TOP 2 parentOrder, MAX(DateOrdered)as highestdates FROM allOrders GROUP BY parentOrder ORDER BY MAX(DateOrdered)DESC)b on a.parentOrder=b.parentOrder
OUTER APPLY (SELECT TOP 1 parentOrder, DateOrdered FROM allOrders top1 WHERE a.parentOrder=top1.parentOrder ORDER BY top1.DateOrdered DESC)top1
SQLFiddle
I'm having a hard time understanding your exact full needs, including the paging situation. You could provide the expected result set for the sample you supplied, it would be easier to check against.
Anyway, it seems your main difficulty is on this:
Is there any way I can 'aggregate' the results of the recursive selects, so that I could select the maximum date across an entire 'parent' node?
... this can be easily done with that recursive CTE and an APPLY.
I'm not sure of what do you want exactly, so I made these two fiddles:
SQL Fiddle 1 - Here all children are together based on the root order, i.e., order 3 is with its parent's (order 2) parent (order 1).
SQL Fiddle 2 - Here the children are grouped with their immediate parent, and also parents become roots, so order 2 doesn't get to the top with its parent (order 1).
I think you'll be going for some modification of the first one.
Again, it's really important to provide your expected result in questions like this one, or you'll get a lot of trial and error answers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With