Why are the results of COUNT doubled when I do two joins? [duplicate]

I have these tables

device

 id      name         groupId     serviceId
791   Mamie Ortega      205         1832

group

 id   serviceId
205     1832

record

 id          date                      deviceId
792   2017-07-13 13:30:19.740360         791
793   2017-07-13 13:30:19.742799         791

alarm

 id    status    deviceId
241      new        791
242      new        791

I'm running this query

SELECT device.id, device.name, COUNT(records.id) AS "last24HMessagesCount", COUNT(alarms.id) AS "activeAlarmsCount"
FROM device
  INNER JOIN "group" AS "group" ON "device"."groupId" = "group"."id" AND "group"."id" = '205'
  LEFT OUTER JOIN "record" AS "records" ON "device"."id" = "records"."deviceId" AND "records"."date" > '2017-07-12 11:43:02.838 +00:00'
  LEFT OUTER JOIN "alarm" AS "alarms" ON "device"."id" = "alarms"."deviceId" AND "alarms"."status" = 'new'
WHERE "device"."serviceId" = 1832
GROUP BY device.id;

Which gives me this result

 id      name       last24HMessagesCount      activeAlarmsCount   
791   Mamie Ortega         4                          4

This result is wrong: I expect 2 for both last24HMessagesCount and activeAlarmsCount.

If I remove one of the counts, last24HMessagesCount for example, and execute

SELECT device.id, device.name, COUNT(alarms.id) AS "activeAlarmsCount"
FROM device
  INNER JOIN "group" AS "group" ON "device"."groupId" = "group"."id" AND "group"."id" = '205'
  LEFT OUTER JOIN "alarm" AS "alarms" ON "device"."id" = "alarms"."deviceId" AND "alarms"."status" = 'new'
WHERE "device"."serviceId" = 1832
GROUP BY device.id;

The result is correct

 id      name       activeAlarmsCount   
791   Mamie Ortega         2

I do not understand. Why are the counts doubled?

ThomasThiebaud asked Jul 13 '17 12:07


2 Answers

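This is very simple to answer. You have two record rows and two alarm rows for the device. Joining both tables to device pairs every record row with every alarm row, so you get 2 x 2 = 4 joined rows, and each COUNT counts all four.

You can work around this problem by counting distinct IDs: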
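To make the row multiplication visible, here is a minimal sketch that runs the two LEFT JOINs without GROUP BY or COUNT. It uses Python's sqlite3 with an in-memory database, which is an assumption made only so the example is self-contained (the question's database is probably something else), and it assumes the sample rows carry deviceId 791 so that they belong to the device shown.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE record (id INTEGER PRIMARY KEY, "deviceId" INTEGER);
    CREATE TABLE alarm  (id INTEGER PRIMARY KEY, status TEXT, "deviceId" INTEGER);

    INSERT INTO device VALUES (791, 'Mamie Ortega');
    -- deviceId 791 is assumed so the rows belong to the device above
    INSERT INTO record VALUES (792, 791), (793, 791);
    INSERT INTO alarm  VALUES (241, 'new', 791), (242, 'new', 791);
""")

# The same two LEFT JOINs as in the question, but without GROUP BY / COUNT,
# so the intermediate rows that COUNT would see are returned directly.
joined_rows = conn.execute("""
    SELECT device.id, records.id, alarms.id
    FROM device
    LEFT JOIN record AS records ON device.id = records."deviceId"
    LEFT JOIN alarm  AS alarms  ON device.id = alarms."deviceId"
                               AND alarms.status = 'new'
    ORDER BY records.id, alarms.id
""").fetchall()

for row in joined_rows:
    print(row)
```

Each record id shows up once per alarm and each alarm id once per record, which is why a plain COUNT on either column returns 4.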

COUNT(DISTINCT records.id) AS "last24HMessagesCount",
COUNT(DISTINCT alarms.id) AS "activeAlarmsCount"

but I would not recommend this. Why join record and alarm at all? They are not directly related. What you actually want to join to each device is the number of its records and the number of its alarms. So aggregate before joining:

SELECT 
  device.id, 
  device.name, 
  -- COALESCE turns the NULL from an unmatched LEFT JOIN into 0
  COALESCE(records.cnt, 0) AS "last24HMessagesCount", 
  COALESCE(alarms.cnt, 0) AS "activeAlarmsCount"
FROM device
LEFT OUTER JOIN 
(
  -- one row per device: number of recent records
  SELECT "deviceId", count(*) AS cnt
  FROM record
  WHERE "date" > '2017-07-12 11:43:02.838 +00:00'
  GROUP BY "deviceId"
) AS records ON device.id = records."deviceId"
LEFT OUTER JOIN 
(
  -- one row per device: number of new alarms
  SELECT "deviceId", count(*) AS cnt
  FROM alarm
  WHERE status = 'new'
  GROUP BY "deviceId"
) AS alarms ON device.id = alarms."deviceId"
WHERE device."serviceId" = 1832
  AND device."groupId" = 205;

(I've removed the unnecessary join to the "group" table.)

Thorsten Kettner answered Sep 30 '22 01:09


Your joins are producing a Cartesian product along two dimensions. The simplest solution is to use COUNT(DISTINCT):

SELECT device.id, device.name,
       COUNT(DISTINCT records.id) AS "last24HMessagesCount",
       COUNT(DISTINCT alarms.id) AS "activeAlarmsCount"

This works well as long as the counts are not very large. A more scalable alternative is to do the aggregation before the LEFT JOINs, or to use correlated subqueries (or lateral joins).
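The correlated-subquery variant could look roughly like the sketch below. It is run against an in-memory SQLite database through Python's sqlite3, an assumption chosen only to keep the example self-contained (SQLite compares the dates as text here, and the question's real database is probably different). The sample rows are assumed to carry deviceId 791 so that they match the device.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (id INTEGER PRIMARY KEY, name TEXT,
                         "groupId" INTEGER, "serviceId" INTEGER);
    CREATE TABLE record (id INTEGER PRIMARY KEY, "date" TEXT, "deviceId" INTEGER);
    CREATE TABLE alarm  (id INTEGER PRIMARY KEY, status TEXT, "deviceId" INTEGER);

    INSERT INTO device VALUES (791, 'Mamie Ortega', 205, 1832);
    -- deviceId 791 is assumed so the rows belong to the device above
    INSERT INTO record VALUES (792, '2017-07-13 13:30:19.740360', 791),
                              (793, '2017-07-13 13:30:19.742799', 791);
    INSERT INTO alarm  VALUES (241, 'new', 791), (242, 'new', 791);
""")

# Each count is computed by its own scalar subquery, so record and alarm
# are never joined to each other and cannot multiply each other's rows.
counts = conn.execute("""
    SELECT device.id,
           device.name,
           (SELECT COUNT(*)
              FROM record
             WHERE record."deviceId" = device.id
               AND record."date" > '2017-07-12 11:43:02.838') AS "last24HMessagesCount",
           (SELECT COUNT(*)
              FROM alarm
             WHERE alarm."deviceId" = device.id
               AND alarm.status = 'new') AS "activeAlarmsCount"
    FROM device
    WHERE device."serviceId" = 1832
      AND device."groupId" = 205
""").fetchall()

print(counts)
```

A side benefit of this form is that COUNT(*) in a scalar subquery yields 0, not NULL, for a device with no matching rows.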

Gordon Linoff answered Sep 30 '22 02:09