How costly are JOINs in SQL? And/or, what's the trade off between performance and normalization?

Tags:

I've found a similar thread but it doesn't really capture the essence of what I'm trying to ask - so I've created a new thread.

I know there is a trade-off between normalization and performance, and I'm wondering what's the best practice for drawing that line? In my particular situation, I have a messaging system that has three distinct tables: messages_threads (overarching message holder), messages_recipients (who is involved), and messages_messages (the actual messages + timestamps).

In order to return the "inbox" view, I have to left join the messages_threads table, users table, and pictures tables to the messages_recipients tables in order to get the information to populate the view (profile picture, sender name, thread id)... and I've still got add a join to messages to retrieve the text from the last message in order to display a "preview" of the last message to the user.

My question is: How costly are JOINS in SQL to performance? I could, for instance, store the sender's name (which I have to left join from users to retrieve) under a field in the messages_threads table called "sendername" - but in terms of normalization I've always been taught to avoid data redundancy?

Where do you draw the line? Or am I overestimating how performance-hampering SQL joins are?

389

asked Apr 24 '11 22:04

Walker

1 Answers

The best practice is to always start with 3NF, and then only consider denormalistion if you find a specific performance problem.

Performance is just one of the issues you have to deal with with databases. By duplicating data, you run the risk of allowing inconsistent data to be in your database, thus nullifying one of the core principles of relational databases, consistency (the C in ACID) ^a.

Yes, joins have a cost, there's no getting around that. However, the cost is usually a lot less than you'd think, and can often be swamped by other factors like network transmission times. By making sure the relevant columns are indexed properly, you can avoid a lot of those costs.

And, remember the optimisation mantra: measure, don't guess! And measure in a production-like environment. And keep measuring (and tuning) periodically - optimisation is only a set and forget operation if your schema and data never change (very unlikely).

^a) Reversion for performance can usually be made safe by using triggers to maintain consistency. This will, of course, slow down your updates, but may still let your selects run faster.

199

answered Sep 22 '22 07:09

paxdiablo

Related questions
                            
                                Use of greater and less than signs together in mysql query <>
                            
                                MySQL Join Same Table
                            
                                Pdo connection without database name?
                            
                                Second SELECT query if first SELECT returns 0 rows
                            
                                Error (1142) SELECT command denied on session_variables table with mysqldump
                            
                                MYSQL query performs very slow
                            
                                MySQL flush tables - Current database or every database?
                            
                                SQL update one column from another column in another table
                            
                                MySQL group-by very slow
                            
                                MySQL List All Duplicates [duplicate]
                            
                                How to take mysql dump of selected columns of a table
                            
                                Doctrine many to one relationship persist operation
                            
                                Insert using PreparedStatement. How do I auto-increment the ID?
                            
                                MySQL - Is it possible to use LIKE on all columns in a table?
                            
                                java.sql.SQLException: Field 'supplier_id' doesn't have a default value
                            
                                check current user in mysql command line
                            
                                How to create and insert a JSON object using MySQL queries?
                            
                                How can I see raw bytes stored in a MySQL column?
                            
                                multiple values in mysql variable
                            
                                How do I cast a type to a bigint in MySQL?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How costly are JOINs in SQL? And/or, what's the trade off between performance and normalization?

Tags:

performance

sql

join

mysql

Walker

People also ask

1 Answers

paxdiablo

Recent Activity

Donate For Us