I have realized a very strange behavior of MySQL that I have no explanation for.
This is a not overly complex query:
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
The columns Enricher3State, Enricher4State, Enricher5State, Enricher9State do have an index and are of the datatype int(11).
Now I have tried to change these Enricher[x]State to a string:
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
Every common sense would say that the string variant should perform the same, or slower, because the datatype of the column is integer!
But apparently this is not the case!
Query with integer notation (first one): 7.23048825s
Query with string notation (last one): 5.22188450s
As you can see, there is a huge performance difference, even though the query cost is the same in both cases.
I have absolutely no clue how this difference could happen - and if this means, I should change all queries in my project using the string notation...
I am using MySQL version 5.7.10
According to your comments I have deactivated all services that write or read to the database and repeated the experiment.
A) The integer notation:
SET profiling=0;
SET profiling=1;
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);
SHOW PROFILES;
Execution time of each query:
B) The string notation:
SET profiling=0;
SET profiling=1;
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);
SHOW PROFILES;
Execution time:
As you can clearly see, the string notation is still faster!
The same behavior was also detected by other developers of my team, so I could exclude temporary stupidity of myself…
Since the fields are indexed and you have OR condition and query does have integer constant as the condition, MySQL may spend time on cross index join calculation and then do table scan and with string constant MySQL does not do index considerations and just does table scan.
It is the case when having indices on many fields that are used in OR condition is not advantage but extra work for MySQL.
The OR condition does not grant required indexing of participating fields, quite often having indices on the "1,2,3,4" fields is bad for the table. These fields should be taken out to a separate table.
Added: Run EXPLAIN and if you see indices "1,2,3,4" fields listed for considered keys, that is what MySQL is spending the time on.
Considering the answer from Sergiy Tytarenko, I have deleted the indices on the Enricher[x]State columns.
Execution time for integer notation:
Execution time for string notation:
The execution time is about the same now.
So indeed, when having multiple indices on columns connected with OR, care should be taken.
It seems I have accidentally discovered a nice workaround (except deleting the index), by making a string out of an integer...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With