I wonder whether using a CASE ... WHEN ... THEN expression in MySQL queries
has a negative effect on performance.
Instead of using a CASE expression (for example inside an UPDATE query),
you can always write an if/else statement in your program
(written in PHP, Python, Perl, Java, ...) to choose which query to send. For example (in pseudocode):
prepareStatement(
"UPDATE t1 SET c1=c1+1, msg=CASE (@v:=?) WHEN '' THEN msg ELSE @v END"
);
setStatementParameter(1, message);
or instead:
if (message == "") {
    prepareStatement("UPDATE t1 SET c1=c1+1");
} else {
    prepareStatement("UPDATE t1 SET c1=c1+1, msg=?");
    setStatementParameter(1, message);
}
(c1 is needed here only to show that something happens in both cases.)
Which approach performs better?
And how large is the performance penalty?
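Before comparing speed, it may help to confirm that the two variants really do the same thing. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and a named parameter :msg in place of the MySQL user variable @v. The helper names (make_table, update_with_case, update_with_branch, run) are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; :msg replaces the @v user variable.
CASE_SQL = (
    "UPDATE t1 SET c1 = c1 + 1, "
    "msg = CASE :msg WHEN '' THEN msg ELSE :msg END"
)


def make_table():
    """Create a fresh one-row table so each variant starts from the same state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, msg TEXT)")
    conn.execute("INSERT INTO t1 VALUES (0, 'old')")
    return conn


def update_with_case(conn, message):
    """Variant 1: one statement, the database decides via CASE."""
    conn.execute(CASE_SQL, {"msg": message})


def update_with_branch(conn, message):
    """Variant 2: the client decides which statement to send."""
    if message == "":
        conn.execute("UPDATE t1 SET c1 = c1 + 1")
    else:
        conn.execute("UPDATE t1 SET c1 = c1 + 1, msg = ?", (message,))


def run(update, message):
    """Apply one variant to a fresh table and return the resulting row."""
    conn = make_table()
    update(conn, message)
    row = conn.execute("SELECT c1, msg FROM t1").fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    for message in ("", "new"):
        print(repr(message), run(update_with_case, message), run(update_with_branch, message))
```

Both variants leave msg untouched for an empty message and overwrite it otherwise, so the remaining question is purely one of cost.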
The MySQL CASE statement is faster than a PHP if statement. The PHP if statement takes more time because it loads the data and then processes it, whereas the CASE statement does not.
With lower_case_table_names=2, table and database names are stored on disk using the lettercase specified in the CREATE TABLE or CREATE DATABASE statement, but MySQL converts them to lowercase on lookup. Name comparisons are not case-sensitive. This works only on file systems that are not case-sensitive!
The MySQL CASE Statement
The CASE statement goes through conditions and returns a value when the first condition is met (like an if-then-else statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.
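The first-match behaviour described above can be illustrated with a short sketch. It is an assumption-laden example: it runs against SQLite through Python's sqlite3 module rather than MySQL, and the grade function and its thresholds are invented purely for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database is used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")

GRADE_SQL = """
SELECT CASE
    WHEN :score >= 90 THEN 'A'       -- checked first
    WHEN :score >= 50 THEN 'pass'    -- only reached if the first test failed
    ELSE 'fail'                      -- used when no condition is true
END
"""


def grade(score):
    """Return the CASE result for a hypothetical score."""
    return conn.execute(GRADE_SQL, {"score": score}).fetchone()[0]


if __name__ == "__main__":
    for score in (95, 60, 10):
        print(score, grade(score))
```

A score of 95 satisfies both WHEN conditions, but only the first one determines the result.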
Pretty much all per-row functions will have an impact on performance; the only real question is: "Is the impact small enough to not worry about?".
This is something you should discover by measuring rather than guessing. Database administration is only a set-and-forget activity if neither your data nor your queries ever change. Otherwise, you should be periodically monitoring performance to ensure no problems occur.
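As a rough illustration of "measure, don't guess", here is a sketch of a timing harness for the CASE variant versus the client-side branch. The assumptions are significant: it uses SQLite via Python's sqlite3 module rather than MySQL, the table size and repeat count are arbitrary, and the function names are hypothetical. The numbers it prints say nothing about a real MySQL server; the point is only the shape of the experiment, which you would repeat against your own database and data.

```python
import sqlite3
import time


def setup(rows=1000):
    """Build an in-memory table with an arbitrary number of rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (c1 INTEGER, msg TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (0, ?)", [("old",)] * rows)
    return conn


def case_variant(message):
    """Always send the same statement; CASE decides inside the database."""
    sql = "UPDATE t1 SET c1 = c1 + 1, msg = CASE :m WHEN '' THEN msg ELSE :m END"
    return sql, {"m": message}


def branch_variant(message):
    """Pick the statement on the client side."""
    if message == "":
        return "UPDATE t1 SET c1 = c1 + 1", {}
    return "UPDATE t1 SET c1 = c1 + 1, msg = :m", {"m": message}


def time_updates(variant, messages, repeats=50):
    """Return the wall-clock seconds spent running the variant's updates."""
    conn = setup()
    start = time.perf_counter()
    for _ in range(repeats):
        for message in messages:
            sql, params = variant(message)
            conn.execute(sql, params)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


if __name__ == "__main__":
    messages = ["", "hello"]
    print("CASE   :", time_updates(case_variant, messages))
    print("branch :", time_updates(branch_variant, messages))
```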
By "small enough" in the above comments, I mean, you probably needn't worry about the performance impact of something like:
select * from friends where lower(lastname) = 'smith'
if you only have three friends.
The impact of these things becomes more serious as the table increases in size. For example, if you have one hundred million customers and you want to find all the ones likely to be computer-related, you wouldn't want to try:
select name from customers where lower(name) like '%comp%'
That's likely to bring your DBAs down on you like a ton of bricks.
One way we've fixed this in the past is to introduce redundancy into the data. Using that first example, we would add an extra column called lowerlastname and populate it with the lowercase value of lastname. Then index that for search purposes and your select statements become blindingly fast, as they should be.
And what does that do to our much loved 3NF, I hear you ask? The answer is "not much", if you know what you're doing :-)
You can set up the database so that this new column is populated by an insert/update trigger, to maintain data consistency. It's perfectly acceptable to break 3NF for performance reasons, provided you understand and mitigate the consequences.
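The trigger-maintained column can be sketched as follows. This is a minimal example under stated assumptions, not a definitive implementation: it uses SQLite through Python's sqlite3 module, so the trigger syntax differs from MySQL (where a BEFORE INSERT/UPDATE trigger would typically assign NEW.lowerlastname directly), and the index and trigger names are invented for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    id            INTEGER PRIMARY KEY,
    lastname      TEXT NOT NULL,
    lowerlastname TEXT              -- redundant, maintained by triggers
);
CREATE INDEX idx_friends_lowerlastname ON friends (lowerlastname);

-- Keep the redundant column in sync on insert.
CREATE TRIGGER friends_after_insert AFTER INSERT ON friends
BEGIN
    UPDATE friends SET lowerlastname = lower(NEW.lastname) WHERE id = NEW.id;
END;

-- Keep it in sync whenever lastname changes.
CREATE TRIGGER friends_after_update AFTER UPDATE OF lastname ON friends
BEGIN
    UPDATE friends SET lowerlastname = lower(NEW.lastname) WHERE id = NEW.id;
END;
""")

conn.executemany(
    "INSERT INTO friends (lastname) VALUES (?)",
    [("Smith",), ("JONES",), ("smith",)],
)
conn.execute("UPDATE friends SET lastname = 'SMITHERS' WHERE id = 3")


def find_by_lastname(name):
    """Case-insensitive lookup that compares against the indexed column."""
    rows = conn.execute(
        "SELECT lastname FROM friends WHERE lowerlastname = ? ORDER BY id",
        (name.lower(),),
    )
    return [row[0] for row in rows]


def query_plan():
    """Return SQLite's plan text for the lookup, to check the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT lastname FROM friends WHERE lowerlastname = 'smith'"
    )
    return " ".join(row[3] for row in rows)


if __name__ == "__main__":
    print(find_by_lastname("Smith"))
    print(find_by_lastname("smithers"))
    print(query_plan())
```

The application never writes lowerlastname itself, which is what keeps the redundant copy consistent with lastname.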
Similarly, for that second query, an insert/update trigger could populate a new indexed column name_contains_comp whenever an entry containing the relevant text is inserted or updated.
Since most databases are read far more often than they're written, this moves the cost of the calculation to the insert/update, effectively amortising it across all select operations. The query would then be:
select name from customers where name_contains_comp = 'Y'
Again, you'll find the query blindingly fast at the minor cost of slightly slower inserts and updates.