I have a NewsStories tables which I'm left joining with some related tables. Each News story can have multiple images, categories and addresses. So the query is essentially:
SELECT * FROM NewStories
LEFT JOIN Images ON Newstories.id=Images.story_id
LEFT JOIN Categories ON NewsStories.id=Categories.story_id
LEFT JOIN Addresses ON NewsStories.id=Addresses.story_id
WHERE ...
There are usually a few images and addresses per story, and 1 or 2 categories. The NewsStories table has about 10,000 articles.
The trouble is that the performance is rather slow (in the order of 15-20 seconds, although it does vary quite a bit and sometimes drops to as low as 5 seconds).
I was wondering if there's a better way of organizing the query to speed it up (I'm quite new to SQL).
In particular it seems quite wasteful that the number of rows for a given story is multiplied by the number of images times the number of addresses times the number of categories.
I'm essentially trying to reconstruct the properties of the News story into a single object which I can manipulate in a frontend.
Here's the explain (apologies if the formatting doesn't come out correctly). I'm guessing I'm not indexing Addresses properly if it's "Using where". Is that correct?
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Addresses ALL NULL NULL NULL NULL 6640 Using where
1 SIMPLE NewsStories eq_ref PRIMARY PRIMARY 767 NewsStories.Addresses.story_id 1 Using where
1 SIMPLE Images ref PRIMARY PRIMARY 767 NewsStories.NewsStories.id 1 Using index
1 SIMPLE Categories ref PRIMARY PRIMARY 767 NewsStories.NewStories.id 1
IS LEFT join slower than join? The LEFT JOIN query is slower than the INNER JOIN query because it's doing more work.
Yes, indeed! You can use multiple LEFT JOINs in one query if needed for your analysis.
A LEFT JOIN is absolutely not faster than an INNER JOIN . In fact, it's slower; by definition, an outer join ( LEFT JOIN or RIGHT JOIN ) has to do all the work of an INNER JOIN plus the extra work of null-extending the results.
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [index_type] ON tbl_name (index_col_name,...) [index_type]
index_col_name: col_name [(length)] [ASC | DESC]
index_type: USING {BTREE | HASH}
Check if you really have to select every column in all of the tables? If not, make sure that you only select the columns that you need, avoid using select*
Double check if you really need LEFT JOINS, if no, use INNER JOINs.
If performance is still an issue after you're done tweaking your query, consider denormalizing your schema to eliminate joins
You may also want to consider reducing the load on the database by using caching applications like sphinxsearch and memcached
Check none of your joins are to views rather than actual tables
references:
http://www.sphinxsearch.com
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With