
SQL Joins vs Single Table: Performance Difference?

I am trying to stick to the practice of keeping the database normalized, but that means running many join queries. Is there a performance penalty if many queries use joins, compared with querying a single table that might contain redundant data?

zsharp asked Jan 25 '09 06:01




2 Answers

Keep the Database normalised UNTIL you have discovered a bottleneck. Then only after careful profiling should you denormalise.
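As a rough illustration of what "careful profiling" might look like, here is a minimal sketch that times the same question against a normalised pair of tables and a denormalised copy. Everything in it is an assumption made for the example: the `customers`, `orders` and `orders_flat` tables are hypothetical, and SQLite is used only because it ships with Python. The numbers it prints say nothing about your DBMS or your data; the point is to measure both shapes on your own workload before changing the schema.

```python
import sqlite3
import timeit

# Hypothetical schema for illustration: customers and their orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total INTEGER NOT NULL
    );
    -- Denormalised copy: the customer name is repeated on every order row.
    CREATE TABLE orders_flat (
        id INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        total INTEGER NOT NULL
    );
""")

customers = [(i, f"customer-{i}") for i in range(1, 1001)]
orders = [(i, (i % 1000) + 1, i % 500) for i in range(1, 20001)]
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
conn.execute("""
    INSERT INTO orders_flat
    SELECT o.id, c.name, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
""")
conn.commit()

JOIN_SQL = """
    SELECT c.name, SUM(o.total)
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.name ORDER BY c.name
"""
FLAT_SQL = """
    SELECT customer_name, SUM(total)
    FROM orders_flat
    GROUP BY customer_name ORDER BY customer_name
"""

def run(sql):
    return conn.execute(sql).fetchall()

# Both shapes must give the same answer before a timing comparison means anything.
join_rows = run(JOIN_SQL)
flat_rows = run(FLAT_SQL)

# Time 20 executions of each query shape.
join_secs = timeit.timeit(lambda: run(JOIN_SQL), number=20)
flat_secs = timeit.timeit(lambda: run(FLAT_SQL), number=20)
print(f"join: {join_secs:.4f}s  flat: {flat_secs:.4f}s")
```

If the join version is not measurably slower on realistic data, there is no bottleneck to fix and no reason to denormalise.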

In most instances, a good set of covering indexes and up-to-date statistics will resolve performance and blocking issues without any denormalisation.
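To make "covering index" and "up-to-date statistics" concrete, here is a small sketch using SQLite from Python. The `orders` table and the index name are hypothetical, and SQLite is an assumption chosen only so the example runs anywhere. Other platforms have their own syntax for covering indexes, statistics refresh and plan output, so treat this as the idea rather than the exact commands for your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [((i % 1000) + 1, i % 500) for i in range(20000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"

def plan(sql, params):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(QUERY, (42,))

# The index holds both the filter column and the selected column, so the
# query can be answered from the index alone without touching the table.
conn.execute(
    "CREATE INDEX idx_orders_customer_total ON orders (customer_id, total)"
)
# Refresh the statistics the query planner uses when choosing a plan.
conn.execute("ANALYZE")

plan_after = plan(QUERY, (42,))
print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan is a full scan of `orders`; afterwards SQLite reports a search using the covering index. The schema stays normalised throughout.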

Using a single table could even lead to worse performance if there are writes as well as reads against it, because all of that activity then contends for the same table.

Mitch Wheat answered Sep 20 '22 21:09


Michael Jackson (not that one) is famously believed to have said,

  • The First Rule of Program Optimization: Don't do it.
  • The Second Rule of Program Optimization – For experts only: Don't do it yet.

That was probably before RDBMSs were around, but I think he'd have extended the Rules to include them.

Multi-table SELECTs are almost always needed with a normalised data model; as is often the case with this kind of question, the "correct" answer to the "denormalise?" question depends on several factors.

DBMS platform

The relative performance of multi- vs single-table queries is influenced by the platform your application lives on, because query optimisers vary in sophistication. MySQL, for example, is in my experience screamingly fast on single-table queries but doesn't optimise queries with multiple joins as well. This isn't a real issue with smaller tables (fewer than 10K rows, say) but really hurts with large (10M+) ones.

Data volume

Unless you're looking at tables in the 100K+ row region, there pretty much shouldn't be a problem. If you're looking at table sizes in the hundreds of rows, I wouldn't even bother thinking about indexing.

(De-)normalisation

The whole point of normalisation is to minimise duplication, to try to ensure that any field value that must be updated need only be changed in one place. Denormalisation breaks that, which isn't much of a problem if updates to the duplicated data are rare (ideally they should never occur). So think very carefully before duplicating anything but the most static data. Note also that your database may grow significantly, since every duplicated value takes up space in each row that carries it.
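The "changed in one place" point can be shown with a small sketch. The tables here (`customers`, `orders`, `orders_flat`) are hypothetical and SQLite is assumed purely for convenience; the row counts are what matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    -- Denormalised variant: the email is copied onto every order row.
    CREATE TABLE orders_flat (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        customer_email TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'old@example.com')")
conn.executemany(
    "INSERT INTO orders VALUES (?, 1)", [(i,) for i in range(1, 501)]
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, 1, 'old@example.com')",
    [(i,) for i in range(1, 501)],
)

# Normalised: the email lives in exactly one row.
normalised_rows = conn.execute(
    "UPDATE customers SET email = 'new@example.com' WHERE id = 1"
).rowcount

# Denormalised: every order carrying the copy has to be rewritten.
flat_rows = conn.execute(
    "UPDATE orders_flat SET customer_email = 'new@example.com' "
    "WHERE customer_id = 1"
).rowcount

print(normalised_rows, flat_rows)  # prints: 1 500
```

One customer changing their email touches one row in the normalised design and 500 in the flat one, and any row the update misses is now inconsistent. That is why duplicating only near-static data is the safer trade.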

Requirements/Constraints

What performance requirements are you trying to meet? Do you have fixed hardware or a budget? Sometimes a performance boost can be most easily - and even most cheaply - achieved by a hardware upgrade. What transaction volumes are you expecting? A small-business accounting system has a very different profile to, say, Twitter.

One last thought strikes me: if you denormalise enough, how is your database different from a flat file? SQL is superb for flexible data and multi-dimensional retrieval, but it can be an order of magnitude (at least) slower than a straight sequential or fairly simply indexed file.

Mike Woodhouse answered Sep 18 '22 21:09