Having a problem with one of my SQL queries. This is my query:
explain
SELECT DISTINCT profiles.hoofdrubriek, profiles.plaats, profiles.bedrijfsnaam, profiles.gemeente, profiles.bedrijfsslogan, profiles.straatnaam, profiles.huisnummer, profiles.postcode, profiles.telefoonnummer, profiles.fax, profiles.email, profiles.website, profiles.bedrijfslogo
FROM profiles
LEFT JOIN profile_subrubriek ON profiles.ID=profile_subrubriek.profile_id
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID
WHERE (
rubrieken.rubriek = 'Pedicurepraktijken' OR
profiles.hoofdrubriek = 'Pedicurepraktijken'
)
ORDER BY profiles.grade DESC, profiles.bedrijfsnaam
The 'OR' operator in this piece of the query is causing troubles:
rubrieken.rubriek = 'Pedicurepraktijken' OR profiles.hoofdrubriek = 'Pedicurepraktijken'
I have indexes applied on all my tables which function properly if I take out one of the two pieces of the above line of code. Combining them with an OR operator causes it to break down and it refuses to use the index I have applied on the 'hoofdrubriek' column in my profiles table. Below the layouts of my relevant tables:
CREATE TABLE `profiles` (
`ID` varchar(255) NOT NULL DEFAULT '',
......
`hoofdrubriek` varchar(255) DEFAULT NULL,
...
`timestamp` datetime DEFAULT NULL,
`meerderevestigingen` varchar(255) NOT NULL,
`grade` int(5) NOT NULL,
PRIMARY KEY (`ID`),
KEY `IDX_TIMESTAMP` (`timestamp`),
KEY `IDX_NIEUW` (`nieuw`),
KEY `IDX_HOOFDRUBRIEK` (`hoofdrubriek`),
KEY `bedrijfsnaam` (`bedrijfsnaam`),
KEY `grade` (`grade`),
KEY `gemeente` (`gemeente`),
KEY `plaats` (`plaats`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `rubrieken` (
`ID` mediumint(9) NOT NULL AUTO_INCREMENT,
`rubriek` varchar(255) NOT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `rubriek` (`rubriek`)
) ENGINE=MyISAM AUTO_INCREMENT=1905 DEFAULT CHARSET=utf8
CREATE TABLE `profile_subrubriek` (
`profile_id` varchar(20) NOT NULL,
`subrubriek_id` mediumint(9) NOT NULL,
PRIMARY KEY (`subrubriek_id`,`profile_id`),
KEY `profile_id` (`profile_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
Of course I could solve the problem with a UNION DISTICT, thus combining the two different queries, but I don't think that should be the way to go..
Well, if an or
is causing a problem, then the simplest solution is to break the query into two pieces and put them together using union
(in your case, because of the distinct
). Rectifying the where
clause using an index is going to probably be impossible, because it references two different columns:
SELECT p.hoofdrubriek, p.plaats, p.bedrijfsnaam, p.gemeente, p.bedrijfsslogan, profiles.straatnaam,
p.huisnummer, profiles.postcode, p.telefoonnummer, p.fax, p.email, p.website, p.bedrijfslogo,
p.grade
FROM profiles p
LEFT JOIN profile_subrubriek ON p.ID=profile_subrubriek.profile_id
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID
WHERE rubrieken.rubriek = 'Pedicurepraktijken'
union
SELECT p.hoofdrubriek, p.plaats, p.bedrijfsnaam, p.gemeente, p.bedrijfsslogan, profiles.straatnaam,
p.huisnummer, profiles.postcode, p.telefoonnummer, p.fax, p.email, p.website, p.bedrijfslogo,
p.grade
FROM profiles p
LEFT JOIN profile_subrubriek ON p.ID=profile_subrubriek.profile_id
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID
WHERE p.hoofdrubriek = 'Pedicurepraktijken'
ORDER BY grade DESC, bedrijfsnaam;
I added grade
into the select
clause so it could be used by the order by
.
I think Gordon is right about using UNION, but you can make the UNION much more efficient:
In the first query below, since you are only referring to the profiles
table, you can remove the joins, they will only serve to cause duplicates that subsequently need to be removed. Then in the second, you can change the JOINs from OUTER to INNER, since you are referring to a field in the outer most table in the where clause you are stating there has to be a match. Then by adding a clause to remove values picked up by the first part of the union you will have less records to sort and remove duplicates from.
SELECT profiles.hoofdrubriek,
profiles.plaats,
profiles.bedrijfsnaam,
profiles.gemeente,
profiles.bedrijfsslogan,
profiles.straatnaam,
profiles.huisnummer,
profiles.postcode,
profiles.telefoonnummer,
profiles.fax,
profiles.email,
profiles.website,
profiles.bedrijfslogo,
profiles.grade
FROM profiles
WHERE profiles.hoofdrubriek = 'Pedicurepraktijken'
UNION
SELECT profiles.hoofdrubriek,
profiles.plaats,
profiles.bedrijfsnaam,
profiles.gemeente,
profiles.bedrijfsslogan,
profiles.straatnaam,
profiles.huisnummer,
profiles.postcode,
profiles.telefoonnummer,
profiles.fax,
profiles.email,
profiles.website,
profiles.bedrijfslogo,
profiles.grade
FROM profiles
INNER JOIN profile_subrubriek
ON profiles.ID=profile_subrubriek.profile_id
INNER JOIN rubrieken
ON profile_subrubriek.subrubriek_id=rubrieken.ID
WHERE rubrieken.rubriek = 'Pedicurepraktijken'
AND profiles.hoofdrubriek != 'Pedicurepraktijken'
ORDER BY grade DESC, bedrijfsnaam;
My knowledge of the inner workings of the MySQL optimisier is hazy to say the least, but my understanding is that the underlying cause is that MySQL does not use the index because it needs to scan the whole table anyway to check for the other predicate (rubrieken.rubriek = 'Pedicurepraktijken'
). I think you are expecting that the optimiser will do implicitly is what the UNION does explicitly. (I think) because of the OUTER JOINs and the OR, the optimiser is unable to accurately determine how many rows of profiles
it is going to have to read to find a match in rubrieken
OR in profiles, it cannot determine accurately if an index seek will be more efficient than a table scan, and opts for the table scan.
This is not unique to MySQL, it is not uncommon accross all DMBS for it to be more efficient to use UNION
instead of OR
.
Rearranging the queries as I have gives the optimiser a much better chance of using the right indexes (I suspect without the USE INDEX
hint too, but I haven't tested).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With