First of all, this is a homework assignment. I have already thought of and written a solution, but I am a bit aggravated by the fact that I am unable to profile it successfully or obtain a second opinion as to whether it is in fact a good one.
Say I have a simple table of film participation information (person, film, persons relationship to the film) like the following:
create table film
(
person_name varchar(48) not null,
film_title varchar(128) not null,
relation varchar(48) not null
);
-- { 'Mel Gibson', 'Braveheart', 'director' }
-- { 'Mel Gibson', 'Braveheart', 'cast' }
-- { 'Steven Spielberg', 'A.I.' , 'director' }
-- { 'Hilary Swank', 'Million Dollar Baby', 'cast' }
-- etc
The database and the table are not created or maintained by me, I am simply to query information from it.
I need to produce a set of person names for directors who are cast (act in front of the camera) in each and every film they have directed. The condition should not hold for people who have directed at least one film in which they didn't act, or for people who haven't directed anything. Whether these directors also have relationships with films they didn't direct, is of no consequence here. Basically, the query can be expressed as "get me list of people who acted in every film they have directed".
My query (that as far as I can attest produces correct result set), lo and behold:
(
select person_name
from film
where relation = 'director'
)
except
(
select person_name
from
(
(
select person_name, film_title
from film
where relation = 'director'
)
except
(
select person_name, film_title
from film
where relation = 'cast'
)
) as director_behind_camera_for_film
)
I would like to know whether the query is sound, or if I have been thinking completely wrong about this? If the latter, could you please provide me with a better solution or an explanation thereof?
Don't pay much attention to the fact that I use strings everywhere (where surrogate keys could have been used insted) - this is a simplified example, yet it still demonstrates my challenge.
The SQL SELECT statement returns a result set of records, from one or more tables. A SELECT statement retrieves zero or more rows from one or more database tables or database views. In most applications, SELECT is the most commonly used data manipulation language (DML) command.
The ORDER BY command is used to sort the result set in ascending or descending order.
Query optimization is the overall process of choosing the most efficient means of executing a SQL statement. SQL is a nonprocedural language, so the optimizer is free to merge, reorganize, and process in any order. The database optimizes each SQL statement based on statistics collected about the accessed data.
SELECT tmp.person_name FROM
(
SELECT person_name, film_title, COUNT(relationship) as cnt
FROM film
WHERE relationship IN ('cast', 'director')
GROUP BY person_name, film_title
) as tmp
GROUP BY person_name
HAVING SUM(cnt) = COUNT(cnt)*2
or
SELECT tmp.person_name FROM
(
SELECT person_name, film_title, COUNT(DISTINCT(relationship)) as cnt
FROM film
WHERE relationship IN ('cast', 'director')
GROUP BY person_name, film_title
) as tmp
GROUP BY person_name
HAVING SUM(cnt) = COUNT(cnt)*2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With