Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL: SUM() with JOIN returns incorrect values

Tags:

sql

join

mysql

sum

I am trying to fetch a SUM() for each user in a table, but MySQL is returning the wrong values.

This is how it should look (http://sqlfiddle.com/#!2/7b988/4/0)

user    amount
110     20.898319244385
114     43.144836425781
115     20.487638473511
116     26.07483291626
117     93.054000854492

But this is how it does look (http://sqlfiddle.com/#!2/7b988/2/0)

user    amount
110     167.186554
114     129.434509
115     143.413469
116     208.598663
117     744.432007

This is the query I am trying to run:

SELECT 
    blocks.user_id, 
    SUM(payout_history.amount) as amount
FROM blocks
LEFT JOIN payout_history
ON blocks.user_id = payout_history.user_id
WHERE confirms > 520
GROUP BY blocks.user_id

What am I doing wrong?

like image 984
Tyler Burleigh Avatar asked Dec 12 '22 11:12

Tyler Burleigh


1 Answers

Try this query:

SELECT bl.user_id, SUM( ph.amount ) PAIDOUT
FROM (
   SELECT distinct blocks.user_id 
   FROM blocks
   WHERE confirms > 520
) bl
LEFT JOIN  payout_history ph
ON bl.user_id = ph.user_id
GROUP BY ph.user_id
;

SQLFiddle --> http://sqlfiddle.com/#!2/7b988/48



--- EDIT --- an explanation how the query works (or rather why your query doesn't work) ----

Looking at expected results it seems that the query should calculate a sum of amount column for each user_id, but only for those user_id, that are also in the blocks table, and have a blocks.confirms value grather than 520.
A simple join (also left outer join) cannot work in this case, because the blocks table can contain many records for the same user_id, for example a query that returns rows for only user_id=110 gives the following results:

SELECT *
FROM blocks
WHERE confirms > 520
      AND user_id = 110;

+ ------- + ------------ + ----------- + ------------- +
| id      | user_id      | reward      | confirms      |
+ ------- + ------------ + ----------- + ------------- +
| 0       | 110          | 20.89832115 | 521           |
| 65174   | 110          | 3.80357075  | 698           |
| 65204   | 110          | 4.41933060  | 668           |
| 65218   | 110          | 4.69059801  | 654           |
| 65219   | 110          | 4.70222521  | 653           |
| 65230   | 110          | 4.82805490  | 642           |
| 65265   | 110          | 5.25058079  | 607           |
| 65316   | 110          | 6.17262650  | 556           |
+ ------- + ------------ + ----------- + ------------- +

The straigh join (and LEFT/RIGHT outer join) works in this way, that takes each record from the first joinded table, and pair this record (combine it) with all rows from the other joinded table thet meet the join condition.

In our case the left join produces a below resultset:

SELECT *
FROM blocks
LEFT JOIN payout_history
ON blocks.user_id = payout_history.user_id
WHERE confirms > 520
    AND blocks.user_id = 110;
+ ------- + ------- + ----------- + -------- + --- + ------- + ----------- +
| id      | user_id | reward      | confirms | id  | user_id | amount      |
+ ------- + ------- + ----------- + -------- + --- + ------- + ----------- +
| 0       | 110     | 20.89832115 | 521      | 1   | 110     | 20.898319   |
| 65174   | 110     | 3.80357075  | 698      | 1   | 110     | 20.898319   |
| 65204   | 110     | 4.41933060  | 668      | 1   | 110     | 20.898319   |
| 65218   | 110     | 4.69059801  | 654      | 1   | 110     | 20.898319   |
| 65219   | 110     | 4.70222521  | 653      | 1   | 110     | 20.898319   |
| 65230   | 110     | 4.82805490  | 642      | 1   | 110     | 20.898319   |
| 65265   | 110     | 5.25058079  | 607      | 1   | 110     | 20.898319   |
| 65316   | 110     | 6.17262650  | 556      | 1   | 110     | 20.898319   |
+ ------- + ------- + ----------- + -------- + --- + ------- + ----------- +

and now if we add SUM( amount ) .... GROUP BY user_id, MySql will calucate a sum of all amount values from the above resultset ( 8 rows * 20.898 = ~ 167.184 )

SELECT blocks.user_id, sum( amount)
FROM blocks
LEFT JOIN payout_history
ON blocks.user_id = payout_history.user_id
WHERE confirms > 520
    AND blocks.user_id = 110
GROUP BY blocks.user_id;
+ ------------ + ----------------- +
| user_id      | sum( amount)      |
+ ------------ + ----------------- +
| 110          | 167.186554        |
+ ------------ + ----------------- +



As you see in this case the join doesn't give us desired results - we need something named a semi join - below are different variants of semi joins, give them a try:

SELECT bl.user_id, SUM( ph.amount ) PAIDOUT
FROM (
   SELECT distinct blocks.user_id 
   FROM blocks
   WHERE confirms > 520
) bl
LEFT JOIN  payout_history ph
ON bl.user_id = ph.user_id
GROUP BY ph.user_id
;


SELECT ph.user_id, SUM( ph.amount ) PAIDOUT
FROM payout_history ph
WHERE ph.user_id IN (
     SELECT user_id FROM blocks
     WHERE confirms > 520
  )
GROUP BY ph.user_id
;

SELECT ph.user_id, SUM( ph.amount ) PAIDOUT
FROM payout_history ph
WHERE EXISTS (
     SELECT 1 FROM blocks bl
     WHERE bl.user_id = ph.user_id
        AND bl.confirms > 520
  )
GROUP BY ph.user_id
;
like image 177
krokodilko Avatar answered Dec 28 '22 02:12

krokodilko