In MySQL, how to JOIN two very large tables which both have columns in the WHERE condition?

I'm trying to determine the best general approach for querying two large joined tables when each table has a column in the WHERE clause. Imagine a simple schema with two tables:

posts
 id (int)
 blog_id (int)
 published_date (datetime)
 title (varchar)
 body (text)

posts_tags 
 post_id (int)
 tag_id (int)

With the following indexes:

posts: [blog_id, published_date]
posts_tags: [tag_id, post_id]
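
For anyone who wants to reproduce the setup, the schema and indexes above could be created roughly as follows. This is only a sketch: the primary key on posts.id, the NOT NULL constraints, the VARCHAR length, the InnoDB engine, and the index names are assumptions that the question does not state.

```sql
-- Sketch of the schema described above (MySQL dialect).
-- Assumed, not stated in the question: the primary key on posts.id,
-- NOT NULL, VARCHAR(255), ENGINE=InnoDB, and both index names.
CREATE TABLE posts (
  id             INT          NOT NULL AUTO_INCREMENT,
  blog_id        INT          NOT NULL,
  published_date DATETIME     NOT NULL,
  title          VARCHAR(255) NOT NULL,
  body           TEXT         NOT NULL,
  PRIMARY KEY (id),
  -- composite index from the question: [blog_id, published_date]
  KEY idx_posts_blog_published (blog_id, published_date)
) ENGINE=InnoDB;

CREATE TABLE posts_tags (
  post_id INT NOT NULL,
  tag_id  INT NOT NULL,
  -- composite index from the question: [tag_id, post_id]
  KEY idx_posts_tags_tag_post (tag_id, post_id)
) ENGINE=InnoDB;
```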

We want to SELECT the 10 most recent posts on a given blog that were tagged with "foo". For the sake of this discussion, assume the blog has 10 million posts, and 1 million of those have been tagged with "foo". What is the most efficient way to query for this data?

The naive approach would be to do this:

 SELECT
  p.id, p.blog_id, p.published_date, p.title, p.body
 FROM
  posts p
 INNER JOIN
  posts_tags pt
  ON pt.post_id = p.id
 WHERE
  p.blog_id = 1
  AND pt.tag_id = 1
 ORDER BY
  p.published_date DESC
 LIMIT 10

MySQL will use our indexes, but will still end up scanning millions of records. Is there a more efficient way to retrieve this data without denormalizing the schema?
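
One way to check how many rows MySQL expects to examine is to prefix the query with EXPLAIN. The statement below is a sketch that reuses the table and column names from the schema above. No output is shown because it depends on the data and the MySQL version.

```sql
-- Ask MySQL for its execution plan instead of running the query.
-- In the result, the `key` column shows which index is chosen per table,
-- `rows` is the estimated number of rows examined, and `Extra` reports
-- steps such as "Using temporary" or "Using filesort" when they occur.
EXPLAIN
SELECT p.id, p.blog_id, p.published_date, p.title, p.body
FROM posts p
INNER JOIN posts_tags pt
  ON pt.post_id = p.id
WHERE p.blog_id = 1
  AND pt.tag_id = 1
ORDER BY p.published_date DESC
LIMIT 10;
```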

440

asked Sep 07 '10 21:09

Newt

1 Answer

Any filter on a joined table should go in that join's ON clause. Strictly speaking, the WHERE clause should contain only conditions on the primary table or conditions that involve more than one table. This may not speed up every query, but it helps ensure that MySQL optimizes the query properly.

 SELECT
  p.id, p.blog_id, p.published_date, p.title, p.body
 FROM
  posts p
 INNER JOIN
  posts_tags pt
  ON pt.post_id = p.id
  AND pt.tag_id = 1
 WHERE
  p.blog_id = 1
 ORDER BY
  p.published_date DESC
 LIMIT 10
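
Whether moving the tag filter into the ON clause changes anything can be checked rather than assumed. For an INNER JOIN the result set is the same with the condition in ON or in WHERE, so the only open question is the plan, and the optimizer may well produce an identical one. A sketch of the check, using the names from the question:

```sql
-- Plan for the variant with the tag filter in the ON clause.
-- Compare `key`, `rows`, and `Extra` against the plan for the
-- original query, which has both filters in WHERE.
EXPLAIN
SELECT p.id, p.blog_id, p.published_date, p.title, p.body
FROM posts p
INNER JOIN posts_tags pt
  ON pt.post_id = p.id
  AND pt.tag_id = 1
WHERE p.blog_id = 1
ORDER BY p.published_date DESC
LIMIT 10;
```

If the two plans match, the rewrite is a readability choice rather than a performance fix for this query.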

116

answered Sep 17 '22 16:09

Brent Baisley

Tags: join, optimization, mysql
