Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL query no using indexes properly

Having a problem with one of my SQL queries. This is my query:

explain 
SELECT DISTINCT profiles.hoofdrubriek, profiles.plaats, profiles.bedrijfsnaam, profiles.gemeente, profiles.bedrijfsslogan, profiles.straatnaam, profiles.huisnummer, profiles.postcode, profiles.telefoonnummer, profiles.fax, profiles.email, profiles.website, profiles.bedrijfslogo 
FROM profiles 
LEFT JOIN profile_subrubriek ON profiles.ID=profile_subrubriek.profile_id 
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID  
WHERE (
    rubrieken.rubriek = 'Pedicurepraktijken' OR 
    profiles.hoofdrubriek = 'Pedicurepraktijken'
) 
ORDER BY profiles.grade DESC, profiles.bedrijfsnaam

The 'OR' operator in this piece of the query is causing troubles:

rubrieken.rubriek = 'Pedicurepraktijken' OR profiles.hoofdrubriek = 'Pedicurepraktijken'

I have indexes applied on all my tables which function properly if I take out one of the two pieces of the above line of code. Combining them with an OR operator causes it to break down and it refuses to use the index I have applied on the 'hoofdrubriek' column in my profiles table. Below the layouts of my relevant tables:

CREATE TABLE `profiles` (
 `ID` varchar(255) NOT NULL DEFAULT '',
 ......
 `hoofdrubriek` varchar(255) DEFAULT NULL,
...


 `timestamp` datetime DEFAULT NULL,
 `meerderevestigingen` varchar(255) NOT NULL,
 `grade` int(5) NOT NULL,
 PRIMARY KEY (`ID`),
 KEY `IDX_TIMESTAMP` (`timestamp`),
 KEY `IDX_NIEUW` (`nieuw`),
 KEY `IDX_HOOFDRUBRIEK` (`hoofdrubriek`),
 KEY `bedrijfsnaam` (`bedrijfsnaam`),
 KEY `grade` (`grade`),
 KEY `gemeente` (`gemeente`),
 KEY `plaats` (`plaats`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


CREATE TABLE `rubrieken` (
 `ID` mediumint(9) NOT NULL AUTO_INCREMENT,
 `rubriek` varchar(255) NOT NULL,
 PRIMARY KEY (`ID`),
 UNIQUE KEY `rubriek` (`rubriek`)
) ENGINE=MyISAM AUTO_INCREMENT=1905 DEFAULT CHARSET=utf8


CREATE TABLE `profile_subrubriek` (
 `profile_id` varchar(20) NOT NULL,
 `subrubriek_id` mediumint(9) NOT NULL,
 PRIMARY KEY (`subrubriek_id`,`profile_id`),
 KEY `profile_id` (`profile_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

Of course I could solve the problem with a UNION DISTICT, thus combining the two different queries, but I don't think that should be the way to go..

like image 511
user2704687 Avatar asked Nov 01 '22 10:11

user2704687


2 Answers

Well, if an or is causing a problem, then the simplest solution is to break the query into two pieces and put them together using union (in your case, because of the distinct). Rectifying the where clause using an index is going to probably be impossible, because it references two different columns:

SELECT p.hoofdrubriek, p.plaats, p.bedrijfsnaam, p.gemeente, p.bedrijfsslogan, profiles.straatnaam, 
       p.huisnummer, profiles.postcode, p.telefoonnummer, p.fax, p.email, p.website, p.bedrijfslogo, 
       p.grade
FROM profiles p 
LEFT JOIN profile_subrubriek ON p.ID=profile_subrubriek.profile_id 
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID  
WHERE rubrieken.rubriek = 'Pedicurepraktijken' 
union 
SELECT p.hoofdrubriek, p.plaats, p.bedrijfsnaam, p.gemeente, p.bedrijfsslogan, profiles.straatnaam, 
       p.huisnummer, profiles.postcode, p.telefoonnummer, p.fax, p.email, p.website, p.bedrijfslogo, 
       p.grade
FROM profiles p 
LEFT JOIN profile_subrubriek ON p.ID=profile_subrubriek.profile_id 
LEFT JOIN rubrieken ON profile_subrubriek.subrubriek_id=rubrieken.ID  
WHERE p.hoofdrubriek = 'Pedicurepraktijken'
ORDER BY grade DESC, bedrijfsnaam;

I added grade into the select clause so it could be used by the order by.

like image 162
Gordon Linoff Avatar answered Nov 09 '22 16:11

Gordon Linoff


I think Gordon is right about using UNION, but you can make the UNION much more efficient:

In the first query below, since you are only referring to the profiles table, you can remove the joins, they will only serve to cause duplicates that subsequently need to be removed. Then in the second, you can change the JOINs from OUTER to INNER, since you are referring to a field in the outer most table in the where clause you are stating there has to be a match. Then by adding a clause to remove values picked up by the first part of the union you will have less records to sort and remove duplicates from.

SELECT  profiles.hoofdrubriek, 
        profiles.plaats, 
        profiles.bedrijfsnaam, 
        profiles.gemeente, 
        profiles.bedrijfsslogan, 
        profiles.straatnaam, 
        profiles.huisnummer, 
        profiles.postcode, 
        profiles.telefoonnummer, 
        profiles.fax, 
        profiles.email, 
        profiles.website, 
        profiles.bedrijfslogo,
        profiles.grade
FROM    profiles   
WHERE   profiles.hoofdrubriek = 'Pedicurepraktijken'
UNION
SELECT  profiles.hoofdrubriek, 
        profiles.plaats, 
        profiles.bedrijfsnaam, 
        profiles.gemeente, 
        profiles.bedrijfsslogan, 
        profiles.straatnaam, 
        profiles.huisnummer, 
        profiles.postcode, 
        profiles.telefoonnummer, 
        profiles.fax, 
        profiles.email, 
        profiles.website, 
        profiles.bedrijfslogo,
        profiles.grade
FROM    profiles 
        INNER JOIN profile_subrubriek 
            ON profiles.ID=profile_subrubriek.profile_id 
        INNER JOIN rubrieken 
            ON profile_subrubriek.subrubriek_id=rubrieken.ID  
WHERE   rubrieken.rubriek = 'Pedicurepraktijken' 
AND     profiles.hoofdrubriek != 'Pedicurepraktijken'
ORDER BY grade DESC, bedrijfsnaam;

My knowledge of the inner workings of the MySQL optimisier is hazy to say the least, but my understanding is that the underlying cause is that MySQL does not use the index because it needs to scan the whole table anyway to check for the other predicate (rubrieken.rubriek = 'Pedicurepraktijken'). I think you are expecting that the optimiser will do implicitly is what the UNION does explicitly. (I think) because of the OUTER JOINs and the OR, the optimiser is unable to accurately determine how many rows of profiles it is going to have to read to find a match in rubrieken OR in profiles, it cannot determine accurately if an index seek will be more efficient than a table scan, and opts for the table scan.

This is not unique to MySQL, it is not uncommon accross all DMBS for it to be more efficient to use UNION instead of OR.

Rearranging the queries as I have gives the optimiser a much better chance of using the right indexes (I suspect without the USE INDEX hint too, but I haven't tested).

like image 43
GarethD Avatar answered Nov 09 '22 16:11

GarethD