Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LISTAGG equivalent with windowing clause

In oracle, the LISTAGG function allows me to use it analytically with a OVER (PARTITION BY column..) clause. However, it does not support use of windowing with the ROWS or RANGE keywords.

I have a data set from a store register (simplified for the question). Note that the register table's quantity is always 1 - one item, one transaction line.

TranID TranLine ItemId OrderID Dollars Quantity
------ -------- ------ ------- ------- --------
1      101      23845  23      2.99    1
1      102      23845  23      2.99    1
1      103      23845  23      2.99    1
1      104      23845  23      2.99    1
1      105      23845  23      2.99    1

I have to "match" this data to a table in an special order system where items are grouped by quantity. Note that the system can have the same item ID on multiple lines (components ordered may be different even if the item is the same).

ItemId OrderID Order Line Dollars Quantity
------ ------- ---------- ------- --------
23845  23      1          8.97    3
23845  23      2          5.98    2

The only way I can match this data is by order id, item id and dollar amount.

Essentially I need to get to the following result.

ItemId OrderID Order Line Dollars Quantity Tran ID  Tran Lines
------ ------- ---------- ------- -------- -------  ----------
23845  23      1          8.97    3        1        101;102;103
23845  23      2          5.98    2        1        104;105

I don't specifically care if the tran lines are ordered in any way, all I care is that the dollar amounts match and that I don't "re-use" a line from the register in computing the total on the special order. I don't need the tran lines broken out into a table - this is for reporting purposes and the granularity never goes back down to the register transaction line level.

My initial thinking was that I can do this with analytic functions to do a "best match" to identify the the first set of rows that match to the dollar amount and quantity in the ordering system, giving me a result set like:

TranID TranLine ItemId OrderID Dollars Quantity CumDollar  CumQty
------ -------- ------ ------- ------- -------- --------   ------
1      101      23845  23      2.99    1        2.99       1
1      102      23845  23      2.99    1        5.98       2
1      103      23845  23      2.99    1        8.97       3
1      104      23845  23      2.99    1        11.96      4
1      105      23845  23      2.99    1        14.95      5

So far so good. But I then try to add LISTAGG to my query:

SELECT tranid, tranline, itemid, orderid, dollars, quantity, 
       SUM(dollars) OVER (partition by tranid, itemid, orderid order by tranline) cumdollar,
       SUM(quantity) OVER (partition by tranid, itemid, orderid order by tranline) cumqty
       LISTAGG (tranline) within group (order by tranid, itemid, orderid, tranline) OVER (partition by tranid, itemid, orderid)
FROM table

I discover that it always returns a full agg instead of a cumulative agg:

TranID TranLine ItemId OrderID Dollars Quantity CumDollar  CumQty ListAgg
------ -------- ------ ------- ------- -------- --------   ------ -------
1      101      23845  23      2.99    1        2.99       1      101;102;103;104;105
1      102      23845  23      2.99    1        5.98       2      101;102;103;104;105
1      103      23845  23      2.99    1        8.97       3      101;102;103;104;105
1      104      23845  23      2.99    1        11.96      4      101;102;103;104;105
1      105      23845  23      2.99    1        14.95      5      101;102;103;104;105

So this isn't useful.

I would much prefer to do this in SQL if at all possible. I am aware that I can do this with cursors & procedural logic.

Is there any way to do windowing with the LISTAGG analytic function, or perhaps another analytic function which would support this?

I'm on 11gR2.

like image 801
N West Avatar asked Jun 08 '12 16:06

N West


2 Answers

The only way I can think of to achieve this is with a correlated subquery:

WITH CTE AS
(   SELECT  TranID, 
            TranLine, 
            ItemID, 
            OrderID, 
            Dollars, 
            Quantity, 
            SUM(dollars) OVER (PARTITION BY TranID, ItemID, OrderID ORDER BY TranLine) AS CumDollar, 
            SUM(Quantity) OVER (PARTITION BY TranID, ItemID, OrderID ORDER BY TranLine) AS CumQuantity
    FROM    T
)
SELECT  TranID, 
        TranLine, 
        ItemID, 
        OrderID, 
        Dollars, 
        Quantity, 
        CumDollar, 
        CumQuantity, 
        (   SELECT  LISTAGG(Tranline, ';') WITHIN GROUP(ORDER BY CumQuantity)
            FROM    CTE T2
            WHERE   T1.CumQuantity >= T2.CumQuantity
            AND     T1.ItemID = T2.ItemID
            AND     T1.OrderID = T2.OrderID
            AND     T1.TranID = T2.TranID
            GROUP BY tranid, itemid, orderid
        ) AS ListAgg
FROM    CTE T1;

I realise this doesn't give the exact output you were asking for, but hopefully it is enough to overcome the problem of the cumulative LISTAGG and get you on your way.

I've set up an SQL Fiddle to demonstrate the solution.

like image 76
GarethD Avatar answered Sep 28 '22 23:09

GarethD


In your example, your store register table contains 5 rows and your special order system table contains 2 rows. Your expected result set contains the two rows from your special order system table and all "tranlines" of your store register table should be mentioned in the "Tran Line" column.

This means you need to aggregate those 5 rows to 2 rows. Meaning you don't need the LISTAGG analytic function, but the LISTAGG aggregate function.

Your challenge is to join the rows of the store register table to the right row in the special order system table. You were well on your way by calculating the running sum of dollars and quantities. The only step missing is to define ranges of dollars and quantities by which you can assign each store register row to each special order system row.

Here is an example. First define the tables:

SQL> create table store_register_table (tranid,tranline,itemid,orderid,dollars,quantity)
  2  as
  3  select 1, 101, 23845, 23, 2.99, 1 from dual union all
  4  select 1, 102, 23845, 23, 2.99, 1 from dual union all
  5  select 1, 103, 23845, 23, 2.99, 1 from dual union all
  6  select 1, 104, 23845, 23, 2.99, 1 from dual union all
  7  select 1, 105, 23845, 23, 2.99, 1 from dual
  8  /

Table created.

SQL> create table special_order_system_table (itemid,orderid,order_line,dollars,quantity)
  2  as
  3  select 23845, 23, 1, 8.97, 3 from dual union all
  4  select 23845, 23, 2, 5.98, 2 from dual
  5  /

Table created.

And the query:

SQL> with t as
  2  ( select tranid
  3         , tranline
  4         , itemid
  5         , orderid
  6         , sum(dollars) over (partition by itemid,orderid order by tranline) running_sum_dollars
  7         , sum(quantity) over (partition by itemid,orderid order by tranline) running_sum_quantity
  8      from store_register_table srt
  9  )
 10  , t2 as
 11  ( select itemid
 12         , orderid
 13         , order_line
 14         , dollars
 15         , quantity
 16         , sum(dollars) over (partition by itemid,orderid order by order_line) running_sum_dollars
 17         , sum(quantity) over (partition by itemid,orderid order by order_line) running_sum_quantity
 18      from special_order_system_table
 19  )
 20  , t3 as
 21  ( select itemid
 22         , orderid
 23         , order_line
 24         , dollars
 25         , quantity
 26         , 1 + lag(running_sum_dollars,1,0) over (partition by itemid,orderid order by order_line) begin_sum_dollars
 27         , running_sum_dollars end_sum_dollars
 28         , 1 + lag(running_sum_quantity,1,0) over (partition by itemid,orderid order by order_line) begin_sum_quantity
 29         , running_sum_quantity end_sum_quantity
 30      from t2
 31  )
 32  select t3.itemid "ItemID"
 33       , t3.orderid "OrderID"
 34       , t3.order_line "Order Line"
 35       , t3.dollars "Dollars"
 36       , t3.quantity "Quantity"
 37       , t.tranid "Tran ID"
 38       , listagg(t.tranline,';') within group (order by t3.itemid,t3.orderid) "Tran Lines"
 39    from t3
 40         inner join t
 41           on (   t.itemid = t3.itemid
 42              and t.orderid = t3.orderid
 43              and t.running_sum_dollars between t3.begin_sum_dollars and t3.end_sum_dollars
 44              and t.running_sum_quantity between t3.begin_sum_quantity and t3.end_sum_quantity
 45              )
 46   group by t3.itemid
 47       , t3.orderid
 48       , t3.order_line
 49       , t3.dollars
 50       , t3.quantity
 51       , t.tranid
 52  /

    ItemID    OrderID Order Line    Dollars   Quantity    Tran ID Tran Lines
---------- ---------- ---------- ---------- ---------- ---------- --------------------
     23845         23          1       8.97          3          1 101;102;103
     23845         23          2       5.98          2          1 104;105

2 rows selected.

Regards,
Rob.

like image 22
Rob van Wijk Avatar answered Sep 28 '22 23:09

Rob van Wijk