I have several tables with many to many relationship. For example table video schema
CREATE TABLE `videos` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`description` varchar(10000) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=47606 DEFAULT CHARSET=utf8;
and two tables actors film_actors, and for example writers and film_writers
actor and film actors schemas
CREATE TABLE `actors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `actors_UN` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=152216 DEFAULT CHARSET=utf8;
CREATE TABLE `film_actors` (
`actor_id` int(11) NOT NULL,
`film_id` int(11) NOT NULL,
PRIMARY KEY (`film_id`,`actor_id`),
KEY `FKrs472oyyff3hfwq10pyo94k1d` (`actor_id`),
CONSTRAINT `FK12uvap3je50qd8cq3s0jf7h7r` FOREIGN KEY (`film_id`) REFERENCES `videos` (`id`) ON DELETE CASCADE,
CONSTRAINT `FKrs472oyyff3hfwq10pyo94k1d` FOREIGN KEY (`actor_id`) REFERENCES `actors` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
writer
CREATE TABLE `writers` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `writers_UN` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=39875 DEFAULT CHARSET=utf8;
CREATE TABLE `film_writers` (
`film_id` int(11) NOT NULL,
`writer_id` int(11) NOT NULL,
PRIMARY KEY (`writer_id`,`film_id`),
KEY `FKh0kfwnarp6utb4f80ycj1lap` (`film_id`),
CONSTRAINT `FK4hwpb5l48m0xps6jqn1wyjb63` FOREIGN KEY (`writer_id`) REFERENCES `writers` (`id`) ON DELETE CASCADE,
CONSTRAINT `FKh0kfwnarp6utb4f80ycj1lap` FOREIGN KEY (`film_id`) REFERENCES `videos` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Im try selecet film info with actors and writers, but result contains many dublicate row. I use this sql query
SELECT
vid.id,
vid.title,
vid.original_title_name,
vid.`year`,
ac.id as actor_id,
ac.name as actor_name,
wr.id as writer_id,
wr.name as writer_name
from
(
select
*
From
videos v
where
v.id = 1722
) vid
JOIN film_actors fa ON
vid.id = fa.film_id
JOIN actors ac ON
fa.actor_id = ac.id
JOIN film_writers fw ON
vid.id = fw.film_id
JOIN writers wr ON
fw.writer_id = wr.id
The output is a lot of duplicates row.
id |title |original_title_name |year |actor_id |actor_name |writer_id |writer_name |
-----|----------------|--------------------|-----|---------|-------------------|----------|---------------|
1722 |Назад в будущее |Back to the Future |1985 |1796 |Майкл Дж. Фокс |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |2648 |Джордж ДиЧенцо |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |4807 |Криспин Гловер |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |7601 |Кристофер Ллойд |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |8195 |Лиа Томпсон |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |8707 |Марк МакКлюр |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |9242 |Фрэнсис Ли МакКейн |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |9602 |Уэнди Джо Спербер |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |10545 |Клаудия Уэллс |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |10546 |Томас Ф. Уилсон |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |1796 |Майкл Дж. Фокс |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |2648 |Джордж ДиЧенцо |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |4807 |Криспин Гловер |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |7601 |Кристофер Ллойд |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |8195 |Лиа Томпсон |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |8707 |Марк МакКлюр |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |9242 |Фрэнсис Ли МакКейн |320 |Роберт Земекис |
Is it possible to somehow convert the result to something like that?
id |title |original_title_name |year |actor_id |actor_name |writer_id |writer_name |
-----|----------------|--------------------|-----|---------|-------------------|----------|---------------|
1722 |Назад в будущее |Back to the Future |1985 |1796 |Майкл Дж. Фокс |319 |Боб Гейл |
1722 |Назад в будущее |Back to the Future |1985 |2648 |Джордж ДиЧенцо |320 |Роберт Земекис |
1722 |Назад в будущее |Back to the Future |1985 |4807 |Криспин Гловер | | |
1722 |Назад в будущее |Back to the Future |1985 |7601 |Кристофер Ллойд | | |
1722 |Назад в будущее |Back to the Future |1985 |8195 |Лиа Томпсон | | |
1722 |Назад в будущее |Back to the Future |1985 |8707 |Марк МакКлюр | | |
1722 |Назад в будущее |Back to the Future |1985 |9242 |Фрэнсис Ли МакКейн | | |
1722 |Назад в будущее |Back to the Future |1985 |9602 |Уэнди Джо Спербер | | |
1722 |Назад в будущее |Back to the Future |1985 |10545 |Клаудия Уэллс | | |
1722 |Назад в будущее |Back to the Future |1985 |10546 |Томас Ф. Уилсон | | |
The reason you're getting "duplicate" rows is because one film has many actors and one film has many writers. There is no relationship at all between a film actor and a film writer
A simple query for education purposes:
SELECT *
FROM
films
INNER JOIN actors on films.id = actors.filmid
INNER JOIN writers on films.id = writers.filmid
This is simplified to make my point, it hides the complexity of your tables that break the many:many relationships down to many:one
Now. If a film has 10 actors and 11 writers, this query will produce 110 rows
Every actor will be associated with every writer
Why? Because there is absolutely no relationship between a film actor and a film writer. The two things have nothing at all to do with each other. We haven't written anything in a JOIN condition to assert that some actor attribute is equal to some writer attribute because we can't - there just isn't anything to relate the two of them together
The only thing the database can possibly do as a result, is produce a row set that has a row for every actor/writer combination. The actor data is repeated 11 times and the writer data is repeated 10 times
Actor1/Writer1
Actor1/Writer2
...
Actor2/Writer1
Actor2/Writer2
...
Actor10/Writer11
There's no way round this "problem"-it is caused by trying to put the two unrelated things together on a single query. The only "solution" is to not put them in the same query. If you're writing a website like IMDB and you have a webpage for a film with two tabs, one for actors, and one for writers, run two separate queries (film join actors) and (film join writers) to fill the data for each tab - you flat out can't* do it in one query.
*when I say "can't", I mean "really really shouldn't". The "desired results" grid you've posted associates actor 1796 with writer 319 for no good reason at all - everything on a row is supposed to be related and these two entities share no relation other than some arbitrary decision that they both are ranked first when their ids are ordered ascending. While there are ways that that could be used to associate them and eliminate the Cartesian product from your query, it's a terrible code smell and indicative of the fact that you're approaching some other problem (that we can't see) in entirely the wrong way (sorry)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With