
Selecting rows from one table using values gotten from another table MYSQL

Tags:

mysql

I currently have 2 MySQL tables in my db:

Film and Film_Ratings_Report

The primary key for Film is filmid which is used to identify the film ratings in the Film_Ratings_Report table.

I would like to know if it's possible, using only a MySQL query, to search the ratings table, collect all film ids which fit a certain criterion, and then use the selected IDs to get the film titles from the Film table. Below is the MySQL query I'm using, which isn't working:

SELECT * 
FROM film 
UNION SELECT filmid 
      FROM film_rating_report 
      WHERE rating = 'GE' 
      LIMIT 0,0

I am relatively new to MySQL and would appreciate any help on this.

Thanks in Advance

Stanley Ngumo asked Jul 01 '13 08:07




2 Answers

SELECT * FROM film WHERE filmid IN 
  (SELECT filmid FROM film_rating_report WHERE rating = 'GE');

This should work.

SG 86 answered Oct 17 '22 22:10


It seems you want a semi-join, i.e. a join where only data from one of the 2 joined tables is needed. In this case, that means all rows from film for which there is a matching row in film_rating_report that satisfies the wanted condition (rating = 'GE').

This is not exactly equivalent to a usual join because even if there are 2 (or more) matching rows in the second table (2 ratings of a film, both with 'GE'), we still want the film to be shown once, not twice (or more times) as it would be shown with a usual join.

There are various ways to write a semi-join; the most popular are:
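
To see the difference concretely, here is a minimal sketch using Python's standard-library sqlite3 module. It runs against SQLite rather than MySQL, purely so it is self-contained, and the title column and the sample rows are assumptions made up for illustration:

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumed columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- Film 1 has two 'GE' ratings, film 3 has one, film 2 has none.
    INSERT INTO film_rating_report VALUES
        (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE');
""")

# A usual join repeats the film once per matching rating row.
join_titles = [row[0] for row in conn.execute("""
    SELECT f.title
    FROM film AS f
        JOIN film_rating_report AS frr
             ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE'
    ORDER BY f.filmid
""")]

# A semi-join (here via EXISTS) lists each matching film exactly once.
semi_titles = [row[0] for row in conn.execute("""
    SELECT f.title
    FROM film AS f
    WHERE EXISTS (SELECT 1
                  FROM film_rating_report AS frr
                  WHERE frr.filmid = f.filmid
                    AND frr.rating = 'GE')
    ORDER BY f.filmid
""")]

print(join_titles)  # ['Alpha', 'Alpha', 'Gamma']
print(semi_titles)  # ['Alpha', 'Gamma']
```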

  • using an EXISTS correlated subquery (@Justin's answer):

    SELECT t1.* 
    FROM film t1 
    WHERE EXISTS (SELECT filmid 
                  FROM film_rating_report t2
                  WHERE t2.rating = 'GE'
                  AND t2.filmid = t1.filmid);
    
  • using an IN (uncorrelated) subquery (@SG 86's answer):
    (take care with this form if the joining columns (the two filmid columns in this case) are nullable: a NULL never matches in an IN comparison, and the negated form, NOT IN, returns no rows at all as soon as the subquery yields a NULL)

    SELECT * 
    FROM film 
    WHERE filmid IN 
      ( SELECT filmid 
        FROM film_rating_report 
        WHERE rating = 'GE'
      );
    
  • using a usual JOIN with a GROUP BY to avoid the duplicate rows in the results (@Tomas' answer):
    (note that this specific use of GROUP BY works only in MySQL and in recent versions of Postgres; if you ever want to write a similar query in another DBMS, you'll have to include all columns: GROUP BY f.filmid, f.title, f.director, ...)

    SELECT f.*
    FROM film AS f
        JOIN film_rating_report AS frr
             ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE' 
    GROUP BY f.filmid ;
    
  • a variation on @Tomas's answer, where the GROUP BY is done on a derived table before the JOIN:

    SELECT f.*
    FROM film AS f
        JOIN 
            ( SELECT filmid
              FROM film_rating_report
              WHERE rating = 'GE'
              GROUP BY filmid
            ) AS frr
          ON f.filmid = frr.filmid ;
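
As a sanity check, the four forms above can be run side by side. The following is only a sketch: it uses Python's standard-library sqlite3 module with SQLite standing in for MySQL, the title column and sample rows are made up, and one rating row deliberately has a NULL filmid. It also runs the negated forms (NOT IN versus NOT EXISTS), since that is where nullable columns actually change the result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- Film 1 is rated 'GE' twice; one report row has a NULL filmid.
    INSERT INTO film_rating_report VALUES
        (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE'), (NULL, 'GE');
""")

def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

QUERIES = {
    "exists": """
        SELECT f.filmid FROM film AS f
        WHERE EXISTS (SELECT 1 FROM film_rating_report AS frr
                      WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)""",
    "in": """
        SELECT filmid FROM film
        WHERE filmid IN (SELECT filmid FROM film_rating_report
                         WHERE rating = 'GE')""",
    "join_group_by": """
        SELECT f.filmid FROM film AS f
            JOIN film_rating_report AS frr ON f.filmid = frr.filmid
        WHERE frr.rating = 'GE'
        GROUP BY f.filmid""",
    "derived_table": """
        SELECT f.filmid FROM film AS f
            JOIN (SELECT filmid FROM film_rating_report
                  WHERE rating = 'GE'
                  GROUP BY filmid) AS frr
              ON f.filmid = frr.filmid""",
}

# All four semi-join forms agree, even with the NULL row present.
results = {name: ids(sql) for name, sql in QUERIES.items()}
print(results)  # every value is [1, 3]

# The negated forms disagree: the NULL makes NOT IN filter out every row.
not_in_ids = ids("""
    SELECT filmid FROM film
    WHERE filmid NOT IN (SELECT filmid FROM film_rating_report
                         WHERE rating = 'GE')""")
not_exists_ids = ids("""
    SELECT f.filmid FROM film AS f
    WHERE NOT EXISTS (SELECT 1 FROM film_rating_report AS frr
                      WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)""")
print(not_in_ids)      # []
print(not_exists_ids)  # [2]
```

This only shows that the results match on a toy data set; it says nothing about which form MySQL executes fastest.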
    

Which one to use depends on the RDBMS and the specific version you are using (for example, IN subqueries are best avoided in older versions of MySQL, before the semi-join optimizations added in 5.6, as they may produce inefficient execution plans), as well as on your specific table sizes, data distribution, indexes, etc.

I usually prefer the EXISTS solution, but it never hurts to first test the various queries with the table sizes you have or expect to have in the future and try to find the best combination of query and indexes for your case.


Addition: if there is a unique constraint on the film_rating_report (filmid, rating) combination, which means that no film will ever get the same rating twice, or if there is an even stricter (but more plausible) unique constraint on film_rating_report (filmid), which means that every film has at most one rating, you can simplify the JOIN solutions to the following (and get rid of all the other queries):

    SELECT f.*
    FROM film AS f
        JOIN film_rating_report AS frr
             ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE' ;
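
A small sketch of why the constraint makes the plain JOIN safe, again using Python's standard-library sqlite3 module with SQLite in place of MySQL. The UNIQUE (filmid, rating) declaration, the title column and the sample rows are assumptions for illustration, not the asker's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (
        filmid INTEGER NOT NULL,
        rating TEXT NOT NULL,
        UNIQUE (filmid, rating)   -- no film can get the same rating twice
    );
    INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO film_rating_report VALUES
        (1, 'GE'), (1, 'PG'), (2, 'PG'), (3, 'GE');
""")

# A second 'GE' rating for film 1 violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO film_rating_report VALUES (1, 'GE')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# With duplicates impossible, the plain JOIN returns each film at most once.
plain_join_ids = [row[0] for row in conn.execute("""
    SELECT f.filmid
    FROM film AS f
        JOIN film_rating_report AS frr
             ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE'
    ORDER BY f.filmid
""")]

print(duplicate_rejected)  # True
print(plain_join_ids)      # [1, 3]
```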

ypercubeᵀᴹ answered Oct 17 '22 23:10