I have the following tables:
work_units
- self explanatoryworkers
- self explanatoryskills
- every work unit requires a number of skills if you want to work on it. Every worker is proficient in a number of skills.work_units_skills
- join tableworkers_skills
- join tableA worker can request the next appropriate free highest priority (whatever that means) unit of work to be assigned to her.
Currently I have:
SELECT work_units.*
FROM work_units
-- some joins
WHERE NOT EXISTS (
SELECT skill_id
FROM work_units_skills
WHERE work_unit_id = work_units.id
EXCEPT
SELECT skill_id
FROM workers_skills
WHERE worker_id = 1 -- the worker id that made the request
)
-- AND a bunch of other conditions
-- ORDER BY something complex
LIMIT 1
FOR UPDATE SKIP LOCKED;
This condition makes the query 8-10 times slower though.
Is there a better way to express that a work_units
's skills should be a subset of the workers
's skills or something to improve the current query?
Some more context:
skills
table is fairly small.work_units
and workers
tend to have very few associated skills.work_units_skills
has index on work_unit_id
.workers_skills
into a CTE. This gave a slight improvement (10-15%), but it's still too slow.In terms of readability though, especially in complex queries that have multiple joins, it is easier to spot join conditions when they are placed in the ON clause and filter conditions when they are placed in the WHERE clause.
“Is there a performance difference between putting the JOIN conditions in the ON clause or the WHERE clause in MySQL?” No, there's no difference. The following queries are algebraically equivalent inside MySQL and will have the same execution plan.
The Minus Operator in SQL is used with two SELECT statements. The MINUS operator is used to subtract the result set obtained by first SELECT query from the result set obtained by second SELECT query.
One simple speed-up would be to use EXCEPT ALL
instead of EXCEPT
. The latter removes duplicates, which is unnecessary here and can be slow.
An alternative that would probably be faster is to use a further NOT EXISTS
instead of the EXCEPT
:
...
WHERE NOT EXISTS (
SELECT skill_id
FROM work_units_skills wus
WHERE work_unit_id = work_units.id
AND NOT EXISTS (
SELECT skill_id
FROM workers_skills ws
WHERE worker_id = 1 -- the worker id that made the request
AND ws.skill_id = wus.skill_id
)
)
Demo
http://rextester.com/AGEIS52439 - with the the LIMIT
removed for testing
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With