Say I have 2 tables, one called categories and one called cat_pages.
The categories table has columns ID, title and timestamp. For example:
CREATE TABLE categories (
id INT UNSIGNED PRIMARY KEY,
title VARCHAR(32),
`timestamp` TIMESTAMP,
INDEX (title)
) Engine=InnoDB;
The cat_pages has 2 columns, cat_id and page_id:
CREATE TABLE cat_pages (
cat_id INT UNSIGNED
REFERENCES categories (id)
ON DELETE CASCADE ON UPDATE CASCADE,
page_id INT UNSIGNED
REFERENCES pages (id)
ON DELETE CASCADE ON UPDATE CASCADE,
UNIQUE INDEX (cat_id, page_id),
INDEX (page_id, cat_id),
) Engine=InnoDB;
I'm trying to join the categories table with the cat_pages table on the ID, such that
The query:
SELECT * FROM categories as c
LEFT JOIN cat_pages as p ON c.id = p.cat_id
produces a result set that has the categories repeated multiple times (as there are multiple matches in the cat_pages table. What do I need so that each category is only shown once, and not at all if there are no matches in the cat_pages table?
If you don't want categories that aren't in cat_pages
, don't use a left join; use an inner join. A left join includes every row from the left table, even if there isn't a matching row in the right table (the missing fields are given NULL values). A right join is similar, but includes all rows from the right table. An outer join includes all rows from the left and right tables, joining rows that have matches and joining rows without matches with NULL values. An inner join, by contrast, only includes matching rows. To put it another way, the intersection of left and right joins is an inner join; their union is an outer join. Jeff Atwood posted some nice Venn diagrams describing joins, though it should be noted that the sets in the circles aren't properly the left and right tables, but rather the results of the left and right joins of the left and right tables.
To get distinct rows, you can use a DISTINCT
modifier:
SELECT DISTINCT c.*
FROM categories AS c
INNER JOIN cat_pages AS cp ON c.id = cp.cat_id
As for SELECT * ...
, see "What is the reason not to use select *?"
Another approach to getting distinct rows would be to use an EXISTS
clause or IN
operator, but the join is likely more performant (though only an EXPLAIN
would tell you for certain). Just make sure you have appropriate indices set.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With