Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL: Finding rows that don't take part in a relationship

I have two tables: 'movies' and 'users'. There's an n:m relationship between those, describing what movies a user has seen. This is described with a table 'seen' Now i want to find out for a given user, all the movies he has not seen. My current solution is like this:

SELECT * FROM movies  WHERE movies.id NOT IN (      SELECT seen.movie_id       FROM seen       WHERE seen.user_id=123 ) 

This works fine but does not seem to scale very well. Is there a better approach to this?

like image 872
tliff Avatar asked Feb 12 '09 23:02

tliff


People also ask

How do I view relationships in MySQL?

Click on Reverse Engineer option somewhere you find under the tools or Database menu. It will ask you to choose the tables. Either you select the tables you want to understand or choose the entire DB. It will generate a diagram with relationships.

Which is used to retrieve No of rows in MySQL table?

To counts all of the rows in a table, whether they contain NULL values or not, use COUNT(*). That form of the COUNT() function basically returns the number of rows in a result set returned by a SELECT statement.

How do I find rows in MySQL?

Getting MySQL row count of all tables in a specific database First, get all table names in the database. Second, construct an SQL statement that includes all SELECT COUNT(*) FROM table_name statements for all tables separated by UNION . Third, execute the SQL statement using a prepared statement.

How do I select n rows in MySQL?

Here's the syntax to select top N rows in MySQL. In the above statement, we list the columns column1, column2, … that you want to select in your query. Also, you need to specify LIMIT n after the table name, where n is the number of rows you want to select. The above query will select top n records in your table.


2 Answers

Here's a typical way to do this query without using the subquery method you showed. This may satisfy @Godeke's request to see a join-based solution.

SELECT *  FROM movies m  LEFT OUTER JOIN seen s  ON (m.id = s.movie_id AND s.user_id = 123) WHERE s.movie_id IS NULL; 

However, in most brands of database this solution can perform worse than the subquery solution. It's best to use EXPLAIN to analyze both queries, to see which one will do better given your schema and data.

Here's another variation on the subquery solution:

SELECT *  FROM movies m WHERE NOT EXISTS (SELECT * FROM seen s                    WHERE s.movie_id = m.id                      AND s.user_id=123); 

This is a correlated subquery, which must be evaluated for every row of the outer query. Usually this is expensive, and your original example query is better. On the other hand, in MySQL "NOT EXISTS" is often better than "column NOT IN (...)"

Again, you must test each solution and compare the results to be sure. It's a waste of time to choose any solution without measuring performance.

like image 124
Bill Karwin Avatar answered Sep 22 '22 07:09

Bill Karwin


Not only does your query work, it's the right approach to the problem as stated. Perhaps you can find a different way to approach the problem? A simple LIMIT on your outer select should be very fast even for large tables, for instance.

like image 20
dwc Avatar answered Sep 21 '22 07:09

dwc