I have two tables: 'movies' and 'users'. There's an n:m relationship between those, describing what movies a user has seen. This is described with a table 'seen' Now i want to find out for a given user, all the movies he has not seen. My current solution is like this:
SELECT * FROM movies WHERE movies.id NOT IN ( SELECT seen.movie_id FROM seen WHERE seen.user_id=123 )
This works fine but does not seem to scale very well. Is there a better approach to this?
Click on Reverse Engineer option somewhere you find under the tools or Database menu. It will ask you to choose the tables. Either you select the tables you want to understand or choose the entire DB. It will generate a diagram with relationships.
To counts all of the rows in a table, whether they contain NULL values or not, use COUNT(*). That form of the COUNT() function basically returns the number of rows in a result set returned by a SELECT statement.
Getting MySQL row count of all tables in a specific database First, get all table names in the database. Second, construct an SQL statement that includes all SELECT COUNT(*) FROM table_name statements for all tables separated by UNION . Third, execute the SQL statement using a prepared statement.
Here's the syntax to select top N rows in MySQL. In the above statement, we list the columns column1, column2, … that you want to select in your query. Also, you need to specify LIMIT n after the table name, where n is the number of rows you want to select. The above query will select top n records in your table.
Here's a typical way to do this query without using the subquery method you showed. This may satisfy @Godeke's request to see a join-based solution.
SELECT * FROM movies m LEFT OUTER JOIN seen s ON (m.id = s.movie_id AND s.user_id = 123) WHERE s.movie_id IS NULL;
However, in most brands of database this solution can perform worse than the subquery solution. It's best to use EXPLAIN to analyze both queries, to see which one will do better given your schema and data.
Here's another variation on the subquery solution:
SELECT * FROM movies m WHERE NOT EXISTS (SELECT * FROM seen s WHERE s.movie_id = m.id AND s.user_id=123);
This is a correlated subquery, which must be evaluated for every row of the outer query. Usually this is expensive, and your original example query is better. On the other hand, in MySQL "NOT EXISTS
" is often better than "column NOT IN (...)
"
Again, you must test each solution and compare the results to be sure. It's a waste of time to choose any solution without measuring performance.
Not only does your query work, it's the right approach to the problem as stated. Perhaps you can find a different way to approach the problem? A simple LIMIT on your outer select should be very fast even for large tables, for instance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With