Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL select records with sum greater than threshold

Tags:

mysql

I need to select records based on file size listed in a MySQL database table, with a single query (no store procedures). The record set should contain all records where the sum total file size equals or if needed exceeds a specific threshold. (Example, threshold = 30, results return 3 records with file sizes equal to 10, 10, 20 or 10, 10, 10 or one record with a file size of 32)

table

+----+---------+-----------+
| id | user_id | fileSize  |
+----+---------+-----------+
|  1 |       1 |      9319 |
|  2 |       1 |     51683 |
|  3 |       1 |     19776 |
|  4 |       1 |    395890 |
|  5 |       1 |      7132 |
|  6 |       1 |     97656 |
|  7 |       1 |      9798 |
|  9 |       1 |     16096 |
| 10 |       1 |    113910 |
| 11 |       1 |    160037 |
+----+---------+-----------+

After researching and trying a number of different solutions the best query that I have come up with looks like this:

SELECT f1.user_id, f1.id AS file_id, f1.fileSize, SUM(f2.fileSize) AS totalSum
FROM files AS f1 
INNER JOIN files AS f2 ON f1.id >= f2.id 
WHERE f1.user_id = 1
GROUP BY f1.id 
HAVING totalSum <= 350000;

Example results

+---------+---------+-----------+----------+
| user_id | file_id |  fileSize | totalSum |
+---------+---------+-----------+----------+
|       1 |       1 |      9319 |     9319 |
|       1 |       2 |     51683 |    61002 |
|       1 |       3 |     19776 |    80778 |
+---------+---------+-----------+----------+

Desired results

+---------+---------+-----------+----------+
| user_id | file_id |  fileSize | totalSum |
+---------+---------+-----------+----------+
|       1 |       1 |      9319 |     9319 |
|       1 |       2 |     51683 |    61002 |
|       1 |       3 |     19776 |    80778 |
|       1 |       4 |    395890 |   476668 |
+---------+---------+-----------+----------+

Or

+---------+---------+-----------+----------+
| user_id | file_id |  fileSize | totalSum |
+---------+---------+-----------+----------+
|       1 |       3 |    395890 |   395890 |
+---------+---------+-----------+----------+

What isn't working with the query above is that the threshold will never be met, as it is based on HAVING lesser than the threshold (greater than just returns crazy amounts of records well above the threshold). Also, if there are any records in the set that have a file size exceeding the threshold, the query result sometimes returns empty. Ideal results would meet or slightly exceed the threshold and may contain many records or a single record if the single file size matched or exceeded the threshold.

Any help would be appreciated. I think this is the first time I have posted a question online in about five years. Seriously, been stuck on this for a week. ~ Thx

like image 522
Michael Babcock Avatar asked Mar 19 '15 20:03

Michael Babcock


1 Answers

This seems to be fitted for a UNION resultset. So you have to get 2 queries (one for each "criteria") and join their result using union.

First query would become:

SELECT f1.user_id, f1.id AS file_id, SUM(f1.fileSize) AS totalSum
FROM files AS f1 
WHERE f1.user_id = 1
GROUP BY f1.id 
HAVING totalSum <= 350000;

Now you need the query that select if size is too big:

SELECT f1.user_id, f1.id AS file_id, MAX(f1.fileSize) AS max
FROM files AS f1 
WHERE f1.user_id = 1
GROUP BY f1.id 
HAVING max >= 350000;

Next you want to combine them in a single query. Since both have the same fields you can simply "union" the result

SELECT f1.user_id, f1.id AS file_id, SUM(f1.fileSize) AS totalSum
FROM files AS f1 
WHERE f1.user_id = 1
GROUP BY f1.id 
HAVING totalSum <= 350000
UNION
SELECT f1.user_id, f1.id AS file_id, MAX(f1.fileSize) AS max
FROM files AS f1 
WHERE f1.user_id = 1
GROUP BY f1.id 
HAVING max >= 350000;

PS: You had "ON f1.id >= f2.id" as join criteria, not sure why the > that would be very case-specific :)

like image 176
ROunofF Avatar answered Nov 10 '22 01:11

ROunofF