Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Overhead for MySQL SELECTS - Better to Use One, or Many In Sequence

Is there an appreciable performance difference between having one SELECT foo, bar, FROM users query that returns 500 rows, and 500 SELECT foo, bar, FROM users WHERE id = x queries coming all at once?

In a PHP application I'm writing, I'm trying to choose between a writing clear, readable section of code that would produce about 500 SELECT statements; or writing a it in an obscure, complex way that would use only one SELECT that returns 500 rows.

I would prefer the way that uses clear, maintainable code, but I'm concerned that the connection overhead for each of the SELECTs will cause performance problems.

Background info, in case it's relevant: 1) This is a Drupal module, coded in PHP 2) The tables in question get very few INSERTs and UPDATEs, and are rarely locked 3) SQL JOINs aren't possible for reasons not relevant to the question

Thanks!

like image 828
anschauung Avatar asked Dec 14 '22 01:12

anschauung


2 Answers

It's almost always faster to do one big batch SELECT and parse the results in your application code than doing a massive amount of SELECTs for one row. I would recommend that you implement both and profile them, though. Always strive to minimize the number of assumptions you have to make.

like image 99
Emil H Avatar answered Jan 17 '23 14:01

Emil H


I would not worry about the connection overhead of mysql queries too much, especially if you are not closing the connection between every query. Consider that if your query creates a temporary table, you've already spent more time in the query than the overhead of the query took.

I love doing a complex SQL query, personally, but I have found that the size of the tables, mysql query cache and query performance of queries that need to do range checking (even against an index) all make a difference.

I suggest this:

1) Establish the simple, correct baseline. I suspect this is the zillion-query approach. This is not wrong, and very likely helfully correct. Run it a few times and watch your query cache and application performance. The ability to keep your app maintainable is very important, especially if you work with other code maintainers. Also, if you're querying really large tables, small queries will maintain scalability.

2) Code the complex query. Compare the results for accuracy, and then the time. Then use EXPECT on the query to see what the rows scanned are. I have often found that if I have a JOIN, or a WHERE x != y, or a condition that creates a temporary table, the query performance could get pretty bad, especially if I'm in a table that's always getting updated. However, I've also found that a complex query might not be correct, and also that a complex query can more easily break as an application grows. Complex queries typically scan larger sets of rows, often creating temporary tables and invoke using where scans. The larger the table, the more expensive these get. Also, you might have team considerations where complex queries don't suit your team's strengths.

3) Share the results with your team.

Complex queries are less likely to hit the mysql query cache, and if they are large enough, don't cache them. (You want to save the mysql query cache for frequently hit queries.) Also, query where predicates that have to scan the index will not do as well. (x != y, x > y, x < y). Queries like SELECT foo, bar FROM users WHERE foo != 'g' and mumble < '360' end up doing scans. (The cost of query overhead could be negligible in that case.)

Small queries can often complete without creating temporary tables just by getting all values from the index, so long as the fields you're selecting and predicating on are indexed. So the query performance of SELECT foo, bar FROM users WHERE id = x is really great (esp if columns foo and bar are indexed like, aka alter table users add index ix_a ( foo, bar );.)

Other good ways to increase performance in your application would be to cache those small query results in the application (if appropriate), or doing batch jobs of a materialized view query. Also, consider memcached or some features found in XCache.

like image 21
memnoch_proxy Avatar answered Jan 17 '23 16:01

memnoch_proxy