Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select where cumulative sum is less than a number (in order of priority)

I have a table with id, cost, and priority columns:

create table a_test_table (id number(4,0), cost number(15,2), priority number(4,0));

insert into a_test_table (id, cost, priority) values (1, 1000000, 10);
insert into a_test_table (id, cost, priority) values (2, 10000000, 9);
insert into a_test_table (id, cost, priority) values (3, 5000000, 8);
insert into a_test_table (id, cost, priority) values (4, 19000000, 7);
insert into a_test_table (id, cost, priority) values (5, 20000000, 6);
insert into a_test_table (id, cost, priority) values (6, 15000000, 5);
insert into a_test_table (id, cost, priority) values (7, 2000000, 4);
insert into a_test_table (id, cost, priority) values (8, 3000000, 3);
insert into a_test_table (id, cost, priority) values (9, 3000000, 2);
insert into a_test_table (id, cost, priority) values (10, 8000000, 1);
commit;

select 
    id,
    to_char(cost, '$999,999,999') as cost,
    priority
from 
    a_test_table;
        ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         4   $19,000,000          7
         5   $20,000,000          6
         6   $15,000,000          5
         7    $2,000,000          4
         8    $3,000,000          3
         9    $3,000,000          2
        10    $8,000,000          1

Starting with the highest priority (descending), I want to select the rows where the cost adds up to less than (or equal to) $20,000,000.

The result would look like this:

       ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         7    $2,000,000          4

      Total: $18,000,000

How can I do this with Oracle SQL?

like image 984
User1974 Avatar asked Feb 08 '19 19:02

User1974


1 Answers

Here is a way to do it in pure SQL. I won't swear there isn't a better way.

Basically, it uses a recursive common table expression (i.e., WITH costed...) to compute every possible combination of elements totaling less than 20,000,000.

Then it gets the first full path from that result.

Then, it gets all the rows in that path.

NOTE: the logic assumes that no id is longer than 5 digits. That's the LPAD(id,5,'0') stuff.

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000),
best_path as (  
SELECT *
FROM   costed c 
where not exists ( SELECT 'longer path' FROM costed c2 WHERE c2.path like c.path || '|%' )
order by path
fetch first 1 row only )
SELECT att.* 
FROM best_path cross join a_test_table att
WHERE best_path.path like '%' || lpad(att.id,5,'0') || '%'
order by att.priority desc;
+----+----------+----------+
| ID |   COST   | PRIORITY |
+----+----------+----------+
|  1 |  1000000 |       10 |
|  2 | 10000000 |        9 |
|  3 |  5000000 |        8 |
|  7 |  2000000 |        4 |
+----+----------+----------+

UPDATE - Shorter version

This version uses MATCH_RECOGNIZE to find all the rows in the best group following the recursive CTE:

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
  search depth first by priority desc set ord
SELECT id, cost, priority
FROM   costed c 
MATCH_RECOGNIZE (
  ORDER BY path
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN (STRT ADDON*)
  DEFINE
    ADDON AS ADDON.PATH = PREV(ADDON.PATH) || '|' || LPAD(ADDON.ID,5,'0')
    )
WHERE mno = 1
ORDER BY priority DESC;

UPDATE -- Even shorter version, using clever idea from the SQL*Server link the OP posted

*Edit: removed use of ROWNUM=1 in anchor part of recursive CTE, since it depended on the arbitrary order in which rows would be returned. I'm surprised no one dinged me on that. *

WITH costed (id, cost, priority, running_cost) as 
( SELECT id, cost, priority, cost running_cost
  FROM   ( SELECT * FROM a_test_table
           WHERE  cost <= 20000000
           ORDER BY priority desc
           FETCH FIRST 1 ROW ONLY )
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost
  FROM   costed CROSS APPLY ( SELECT b.*
                              FROM   a_test_table b 
                              WHERE  b.priority < costed.priority
                              AND    b.cost + costed.running_cost <= 20000000
                              FETCH FIRST 1 ROW ONLY
                              ) a
)
CYCLE id SET is_cycle TO 'Y' DEFAULT 'N'
select id, cost, priority from costed
order by priority desc
like image 178
Matthew McPeak Avatar answered Sep 28 '22 07:09

Matthew McPeak