Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple use of LEFT JOIN brings only 1 row

It is an intelligent tag base image search system. User adds images with its proper tags in such a table:

image (id, title, ...)
tag (id, title) /* It doesn't matter who has created the tag*/
imagetag (image_id, tag_id) /* One image may have multiple tags */

User views images and the visits from *those images' tags* are logged in usertagview table. (Note that I've used an INSERT ON DUPLICATE UPDATE query for that purpose.)

usertagview (user_id, tag_id, view_count)

Now please consider some images with the following tags:

  • river, day (It's a picture that shows a river in a sunny day)
  • river, night (That river at the light of the midnight moon)
  • tree, day
  • tree, night
  • flower, day
  • flower, night

User searches for the tag river and any images that has the tag river is displayed: In this case the first image (tagged by river day) and the second (tagged by river night) are shown. User views the second image (tagged by river and night) and view it is logged in the table usertagview.

Then the user tries a new search for the tag tree and views the tree night image.

I want that if the user searches for flower, the flower night be preferred over the flower day. I mean flower night should come before flower day. In other words I want a query that lists images tagged by flower according to user's previous views. (flower night first, OTHER flowers next).

My query that was failed:

SELECT

    DISTINCT (image.id) AS image_id,
    image.title AS image_title,
    SUM(usertagview.view_count) AS SUM_of_all_tag_views_for_each_image

FROM (image)

JOIN imagetag ON imagetag.image_id = image.id

**LEFT JOIN** usertagview ON
    usertagview.tag_id = imagetag.tag_id
    AND usertagview.user_id = {$user_id_from_php}

WHERE

    imagetag.tag_id IN ( {impolde(',', $array_of_id_of_tags_that_the_user_has_entered)} )
    AND
    usertagview.tag_id IN
        (SELECT tag_id FROM imagetag WHERE userimagetag.image_id = image.id)

ORDER BY SUM_of_all_tag_views_for_each_image DESC

THE PROBLEM

is that the **LEFT JOIN** in my query has no difference with a normal INNER JOIN. They both have the same result. Even if I use RIGHT JOIN it will have no difference.

like image 286
Mohammad Naji Avatar asked Feb 21 '23 15:02

Mohammad Naji


2 Answers

The reason that your left join is behaving the same as an inner join is because you have additional criteria for your left join in your where clause. This essentially turns your outer join into an inner join.

The reason for this is because if usertagview.tag_id is NULL in the case where there is no matching record, your IN statement in your WHERE clause eliminates the row with the NULL value.

To correct this, you could move your usertagview.tag_id IN ... check into your join's ON clause.

However, this is only half of your problem. You're only checking the views for the specific tag that was entered by the user, but if I understand your actual requirements, you want to check the views for any tags that are associated with any image that has a tag that matches your search term.

For example, when the user enters flower, you want to first find any image that is tagged with flower, and then check the views for all other tags for that set of images.

I believe the following query accomplishes this, and this SQL Fiddle shows the query in action:

SELECT
  i.id AS image_id,
  i.title AS image_title,
  IFNULL(SUM(utv.view_count), 0) AS associated_view_totals
FROM
  imagetag originalTag
  JOIN imagetag associatedTags 
    ON associatedTags.image_id = originalTag.image_id
  JOIN image i 
    ON i.id = associatedTags.image_id
  LEFT JOIN usertagview utv 
    ON utv.user_id = 1
    AND utv.tag_id = associatedTags.tag_id
WHERE
  -- User searches for flower tag (Let's assume 5 == flower)...
  originalTag.tag_id IN (5)
GROUP BY
  i.id,
  i.title
ORDER BY 
  associated_view_totals DESC
like image 126
Michael Fredrickson Avatar answered Feb 23 '23 04:02

Michael Fredrickson


This is a common problem. And fortunately, an easy one to solve.

See this?

LEFT JOIN usertagview ON
    usertagview.tag_id = imagetag.tag_id -- see this?
    AND usertagview.user_id = {$user_id_from_php}  

WHERE

    imagetag.tag_id IN ( {impolde(',', $array_of_id_of_tags_that_the_user_has_entered)} )
    AND

And this?

    usertagview.tag_id IN -- and this?
        (SELECT tag_id FROM imagetag WHERE userimagetag.image_id = image.id)

Both conditions share the same field, i.e. usertagview.tag_id. So that usertagview.tag_id IN (SELECT tag_id FROM ...) on your WHERE clause basically cancels out whatever success usertagview on LEFT JOINing the imagetag had.

So to fix your query, restore your INNER JOIN-y usertagview to a LEFT JOIN one, then move usertagview condition to JOIN condition instead:

SELECT

    DISTINCT (image.id) AS image_id,
    image.title AS image_title,
    SUM(usertagview.view_count) AS SUM_of_all_tag_views_for_each_image

FROM (image)

JOIN imagetag ON imagetag.image_id = image.id

LEFT JOIN usertagview ON
    usertagview.tag_id = imagetag.tag_id
    AND usertagview.user_id = {$user_id_from_php}

    -- moved the WHERE condition here
    AND
    usertagview.tag_id IN
        (SELECT tag_id FROM imagetag WHERE userimagetag.image_id = image.id)


WHERE

    imagetag.tag_id IN ( {impolde(',', $array_of_id_of_tags_that_the_user_has_entered)} )

ORDER BY SUM_of_all_tag_views_for_each_image DESC

That would fix it. If it isn't (as I don't know exactly on your tables which are one-to-many to each other, or one-to-one to each other, so in this case I'll just throw what would usually works), try to change the INNER JOIN imagetag to LEFT JOIN. And since the imagetag condition in WHERE clause will cancel out whatever rows that results from LEFT JOIN condition had, move that imagetag condition from WHERE clause to LEFT JOIN as well :

SELECT

    DISTINCT (image.id) AS image_id,
    image.title AS image_title,
    SUM(usertagview.view_count) AS SUM_of_all_tag_views_for_each_image

FROM (image)

LEFT JOIN imagetag ON imagetag.image_id = image.id
    -- WHERE clause condition moved here.
    -- WHERE conditionXXX basically cancels out whatever rows
    -- obtained from `LEFT JOIN ON conditionXXX`, in which conditionXXX share
    -- the same field.  IN this case, it is imagetag.
    AND    
    imagetag.tag_id IN ( {impolde(',', $array_of_id_of_tags_that_the_user_has_entered)} )


LEFT JOIN usertagview ON
    usertagview.tag_id = imagetag.tag_id
    AND usertagview.user_id = {$user_id_from_php}

 -- moved here
    AND
    usertagview.tag_id IN
        (SELECT tag_id FROM imagetag WHERE userimagetag.image_id = image.id)



ORDER BY SUM_of_all_tag_views_for_each_image DESC

And if the second suggestion still doesn't deliver the results, your query is currently handling multiple one-to-many table relationships. SQL can't figure out your intent if you have multiple one-to-many table relations in queries; in this case, you need to flatten out the results to get the correct output. Here's a good walkthrough on how to flatten out results: http://www.anicehumble.com/2012/05/sql-count-computer-program-does-what.html

like image 38
Michael Buen Avatar answered Feb 23 '23 05:02

Michael Buen