I have 2 many to many tables, and a table to join them.
Officer
Report
report_officer
I want to select all officers that haven't ever been associated with a report or who has not been associated with a report within a certain timeframe.
So far I have tried the following (below doesn't work for me!):
SELECT *
FROM Officer
LEFT JOIN report_officer
ON Officer.id = report_officer.officer_id
LEFT JOIN Report
ON Report.id = report_officer.report_id
WHERE (performanceDate IS NULL
OR performanceDate < "2014-03-23 00:00:00"
OR performanceDate > "2014-04-01 00:00:00"
)
My left join query works only when the officer has ONLY been associated with a report within a certain timeframe, but fails once they have multiple reports.
Result:
+------------+-----------------+
| officer_id | performanceDate |
+------------+-----------------+
| 130 | NULL | # good
| 134 | 2014-03-02 | # bad - officer_id 134 has a performanceDate
| 134 | 2014-03-09 | # on 2014-3-30, I do not want this in the results.
| 134 | 2014-03-16 | #
| 135 | 2014-03-02 | # good
+------------+-----------------+
SQL Fiddle: http://sqlfiddle.com/#!2/1bf72/3 <- in the sql fiddle, please refer to the 'name' field for which columns I am looking to have returned.
Any ideas on how to make this work?
Ideally I would like to make this as simple as possible to work with my ORM. I am using doctrine and would prefer not to start using totally custom code (so if it can be done with only joins, that would be great). I though have a bad feeling I need a sub-query.
SELECT Officer.*, Report.performanceDate FROM Officer
LEFT JOIN report_officer ON Officer.id = report_officer.officer_id
LEFT JOIN Report ON Report.id = report_officer.report_id
AND
(performanceDate > "2014-03-23 00:00:00" AND
performanceDate < "2014-04-01 00:00:00")
WHERE Report.id IS NULL
You only want to join rows within a specific date range, so you must move the constraint into the on
clause of the join and inverse the constraint.
If you want to remove duplicates, you could try a group by
:
SELECT Officer.id, MAX(Report.performanceDate) FROM Officer
LEFT JOIN report_officer ON Officer.id = report_officer.officer_id
LEFT JOIN Report ON Report.id = report_officer.report_id
AND
(performanceDate > "2014-03-23 00:00:00" AND
performanceDate < "2014-04-01 00:00:00")
WHERE Report.id IS NULL
GROUP BY Officer.id
but you have to decide on which date you want to get, if there are multiple performance dates in your requested date range (or you could use GROUP_CONCAT
to gather all dates).
Actually I am relatively sure, that what you want to achieve is not possible with LEFT JOIN
s at all...
What always works are subquery solutions:
SELECT Officer.id as OfficerID, Officer.name,
Report.id as ReportID,
Report.performanceDate
FROM Officer
LEFT JOIN report_officer
ON Officer.id = report_officer.officer_id
LEFT JOIN Report
ON Report.id = report_officer.report_id
WHERE Report.id IS NULL
OR NOT EXISTS (
SELECT * FROM report_officer
INNER JOIN Report ON report_id = Report.id
WHERE officer_id = Officer.id AND
performanceDate > "2014-03-23 00:00:00"
AND performanceDate < "2014-04-01 00:00:00"
)
but these are not that performant... This one looks if there are reports which should prohibit outputting the row.
I want to select all officers that haven't ever been associated with a report or who has not been associated with a report within a certain timeframe.
Your two conditions are redundant: if an officer has not been associated, ever, then he also cannot have been associated in any timeframe, and will be selected by the second condition. If he has reports in the timeframe, then he isn't selected because of the second condition, but he also has at least one report and cannot satisfy the first.
So what you want is, "an officer with zero reports in the timeframe".
To do so, just invert the conditions: first get those reports within the selected timeframe (i.e. those officers you don't want); then LEFT JOIN
with Officer
, requiring that the join yields null. This will get you the other officers, those that have no report in the selected time frame (or maybe no report at all).
In this case, though, you cannot have a report date, since you have no reports (this is more apparent for those officers with no report at all):
SELECT
Officer.id as OfficerID,
Officer.name,
MAX(Report.id) as ReportID,
MAX(performanceDate) AS performanceDate
FROM Officer
LEFT JOIN report_officer ON (Officer.id = report_officer.officer_id)
LEFT JOIN Report ON (Report.id = report_officer.report_id
AND performanceDate BETWEEN 20140323 AND 20140401)
GROUP BY Officer.id, Officer.name
HAVING ReportID IS NULL;
I don't know about Doctrine and HAVING
. If you cannot use a HAVING
clause, you may try to emulate it by running this, which ought to be quite standard:
SELECT
Officer.id as OfficerID,
Officer.name,
COUNT(Report.id) as reports
FROM Officer
LEFT JOIN report_officer ON (Officer.id = report_officer.officer_id)
LEFT JOIN Report ON (Report.id = report_officer.report_id
AND performanceDate BETWEEN 20140323000000 AND 20140401235959)
GROUP BY Officer.id, Officer.name;
and then apply a filter where reports
is equal to 0, i.e., no reports in the given timeframe. You can add MAX(performanceDate) AS performanceDate, MAX(Report.id) AS ReportID
to get the date of the (e.g. latest) report for those officers who do have at least one outside the timeframe. This might be not the report you'd want.
Care must be taken when specifying date range, since YYYYMMDD equals normally YYYYMMDD000000, which could lead to the equivalent of a semi-inclusive range. Otherwise, replace BETWEEN
with performanceDate >= '2014-03-23 00:00:00' AND performanceDate <= '2014-04-01 23:59:59'
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With