Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MYSQL JOIN with unique result rows

Tags:

sql

join

mysql

Say I have 2 tables, one called categories and one called cat_pages.

The categories table has columns ID, title and timestamp. For example:

CREATE TABLE categories (
    id INT UNSIGNED PRIMARY KEY,
    title VARCHAR(32),
    `timestamp` TIMESTAMP,
    INDEX (title)
) Engine=InnoDB;

The cat_pages has 2 columns, cat_id and page_id:

CREATE TABLE cat_pages (
    cat_id INT UNSIGNED 
        REFERENCES categories (id) 
        ON DELETE CASCADE ON UPDATE CASCADE,
    page_id INT UNSIGNED
        REFERENCES pages (id) 
        ON DELETE CASCADE ON UPDATE CASCADE,
    UNIQUE INDEX (cat_id, page_id),
    INDEX (page_id, cat_id),
) Engine=InnoDB;

I'm trying to join the categories table with the cat_pages table on the ID, such that

  1. Only categories with id's in the category_pages table are retrieved and
  2. Each category is only displayed once in the resultset

The query:

SELECT * FROM categories as c
    LEFT JOIN cat_pages as p ON c.id = p.cat_id

produces a result set that has the categories repeated multiple times (as there are multiple matches in the cat_pages table. What do I need so that each category is only shown once, and not at all if there are no matches in the cat_pages table?

like image 687
tgriesser Avatar asked Dec 28 '22 22:12

tgriesser


1 Answers

If you don't want categories that aren't in cat_pages, don't use a left join; use an inner join. A left join includes every row from the left table, even if there isn't a matching row in the right table (the missing fields are given NULL values). A right join is similar, but includes all rows from the right table. An outer join includes all rows from the left and right tables, joining rows that have matches and joining rows without matches with NULL values. An inner join, by contrast, only includes matching rows. To put it another way, the intersection of left and right joins is an inner join; their union is an outer join. Jeff Atwood posted some nice Venn diagrams describing joins, though it should be noted that the sets in the circles aren't properly the left and right tables, but rather the results of the left and right joins of the left and right tables.

To get distinct rows, you can use a DISTINCT modifier:

SELECT DISTINCT c.* 
  FROM categories AS c 
    INNER JOIN cat_pages AS cp ON c.id = cp.cat_id

As for SELECT * ..., see "What is the reason not to use select *?"

Another approach to getting distinct rows would be to use an EXISTS clause or IN operator, but the join is likely more performant (though only an EXPLAIN would tell you for certain). Just make sure you have appropriate indices set.

like image 95
outis Avatar answered Dec 30 '22 10:12

outis