It's accepted that searching a table on an int column is faster than on a string column (say varchar).
However, if I have a Shirt table with a Color column, would it be more performant to create a Color table with the primary key on that table being the foreign key on the Shirt table? Would the join negate the performance advantage of having the value in the Color column on Shirt being an int instead of a string value such as "Green" when searching for green Shirts?
When the tables hold a large number of rows and there is an index to use, INNER JOIN is generally faster than OUTER JOIN.
TLDR: The most efficient join is also the simplest join, 'Relational Algebra'. If you wish to find out more on all the methods of joins, read further. Relational algebra is the most common way of writing a query and also the most natural way to do so.
Using an outer join limits the database's optimization options, which typically results in slower SQL execution. DISTINCT and UNION should be used only when necessary: both have to eliminate duplicates, usually by sorting or hashing, which slows down execution. Use UNION ALL instead of UNION where possible, as it skips that step.
There is no "better" or "worse" join type. They have different meanings and must be chosen according to the result you need. In your case, you probably do not have employees with no work_log (no rows in that table), so LEFT JOIN and JOIN will be equivalent in results.
If I understand correctly, you are asking which of these two queries would be faster:
SELECT * FROM shirt WHERE color = 'Green'
vs
SELECT s.* FROM shirt s INNER JOIN colors c
ON s.colorid = c.colorid
WHERE c.color = 'Green'
It depends somewhat on the database (possibly a lot, depending on whether it optimizes the join correctly, which most if not all should), but the lookup in the color table should be negligible, and the remaining execution can then use the integer lookup value and should be faster. The bulk of the processing would ultimately be equivalent to SELECT * FROM shirt WHERE colorid = N. However, I suspect that you would not notice a difference in speed unless the table is quite large. The decision should probably be based on which design makes the most sense (probably the normalized one).
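To make the equivalence concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite, the sample rows, the name column, and the idx_shirt_colorid index are all assumptions for illustration; the original question does not name a database. It shows the joined query returning the same rows as a direct integer lookup on colorid, and collects the plan text so you can inspect how this particular engine resolves the join.

```python
import sqlite3

# In-memory database; schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE colors (
        colorid INTEGER PRIMARY KEY,
        color   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE shirt (
        shirtid INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        colorid INTEGER NOT NULL REFERENCES colors(colorid)
    );
    CREATE INDEX idx_shirt_colorid ON shirt(colorid);

    INSERT INTO colors (colorid, color) VALUES (1, 'Green'), (2, 'Red');
    INSERT INTO shirt (name, colorid) VALUES ('Polo', 1), ('Tee', 2), ('Henley', 1);
""")

# The normalized query from the answer: filter by color name through the join.
join_sql = (
    "SELECT s.shirtid, s.name FROM shirt s "
    "INNER JOIN colors c ON s.colorid = c.colorid "
    "WHERE c.color = ? ORDER BY s.shirtid"
)
joined = conn.execute(join_sql, ("Green",)).fetchall()

# The same work done by hand: resolve the name to its id once,
# then search shirt on the integer column.
(color_id,) = conn.execute(
    "SELECT colorid FROM colors WHERE color = ?", ("Green",)
).fetchone()
direct = conn.execute(
    "SELECT shirtid, name FROM shirt WHERE colorid = ? ORDER BY shirtid",
    (color_id,),
).fetchall()

# Plan text is engine- and version-specific; the last column holds the detail.
plan_details = [
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN " + join_sql, ("Green",))
]

print(joined == direct)
for detail in plan_details:
    print(detail)
```

The plan output differs between engines and versions, so treat it as something to check on your own database rather than a guarantee of how the join will be executed.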
Beyond performance, creating a separate Color table makes your design better normalized. So, some day in the future, when someone decides that "Dark Blue" should now be called "Navy Blue", you update 1 row in your Color table vs. updating many rows in your Shirt table.
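The maintenance difference can be sketched the same way, again with Python's sqlite3 module and made-up data. The shirt_flat table is a hypothetical name for the denormalized design, used here only for comparison. The rename touches every matching row in the flat table but a single row in the normalized one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized design: the color name is stored on every shirt row.
    CREATE TABLE shirt_flat (
        shirtid INTEGER PRIMARY KEY,
        color   TEXT NOT NULL
    );
    INSERT INTO shirt_flat (color)
    VALUES ('Dark Blue'), ('Dark Blue'), ('Green'), ('Dark Blue');

    -- Normalized design: shirts reference a row in colors.
    CREATE TABLE colors (
        colorid INTEGER PRIMARY KEY,
        color   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE shirt (
        shirtid INTEGER PRIMARY KEY,
        colorid INTEGER NOT NULL REFERENCES colors(colorid)
    );
    INSERT INTO colors (colorid, color) VALUES (1, 'Dark Blue'), (2, 'Green');
    INSERT INTO shirt (colorid) VALUES (1), (1), (2), (1);
""")

rename = "SET color = 'Navy Blue' WHERE color = 'Dark Blue'"

# rowcount reports how many rows each UPDATE modified.
flat_rows_updated = conn.execute("UPDATE shirt_flat " + rename).rowcount
normalized_rows_updated = conn.execute("UPDATE colors " + rename).rowcount

# Every shirt that pointed at the renamed color sees the new name via the join.
(navy_shirts,) = conn.execute(
    "SELECT COUNT(*) FROM shirt s "
    "INNER JOIN colors c ON s.colorid = c.colorid "
    "WHERE c.color = 'Navy Blue'"
).fetchone()

print(flat_rows_updated, normalized_rows_updated, navy_shirts)
```

With only four shirts the gap is trivial, but the flat table's update grows with the number of shirts while the normalized update stays at one row.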