Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL query - possible to include this clause?

Tags:

sql

mysql

I have the following query, which retrieves 4 adverts from certain categories in a random order.

At the moment, if a user has more than 1 advert, then potentially all of those ads might be retrieved - I need to limit it so that only 1 ad per user is displayed.

Is this possible to achieve in the same query?

SELECT      a.advert_id, a.title, a.url, a.user_id, 
            FLOOR(1 + RAND() * x.m_id) 'rand_ind' 

FROM        adverts AS a
INNER JOIN  advert_categories AS ac
ON          a.advert_id = ac.advert_id,
(
            SELECT MAX(t.advert_id) - 1 'm_id' 
            FROM adverts t
)           x

WHERE       ac.category_id IN 
(
            SELECT category_id
            FROM website_categories
            WHERE website_id = '8'
)
AND         a.advert_type = 'text'

GROUP BY    a.advert_id
ORDER BY    rand_ind 
LIMIT       4
like image 865
BrynJ Avatar asked Jan 15 '11 13:01

BrynJ


People also ask

Can we use with clause in MySQL?

The WITH clause in MySQL is used to specify a Common Table Expression, a with clause can have one or more comms-separated subclauses.

Can we use CTE in MySQL?

In MySQL every query generates a temporary result or relation. In order to give a name to those temporary result set, CTE is used. A CTE is defined using WITH clause. Using WITH clause we can define more than one CTEs in a single statement.

How can we define the clause FROM in MySQL?

customer_id = order_details. customer_id ORDER BY order_id; This MySQL FROM clause example uses the FROM clause to list two tables - customers and order_details. And we are using the FROM clause to specify an INNER JOIN between the customers and order_details tables based on the customer_id column in both tables.

Is subquery allowed in WHERE clause in MySQL?

In MySQL subquery can be nested inside a SELECT, INSERT, UPDATE, DELETE, SET, or DO statement or inside another subquery. A subquery is usually added within the WHERE Clause of another SQL SELECT statement.


1 Answers

Note: The solution is the last query at the bottom of this answer.

Test Schema and Data

create table adverts (
    advert_id int primary key, title varchar(20), url varchar(20), user_id int, advert_type varchar(10))
;
create table advert_categories (
    advert_id int, category_id int, primary key(category_id, advert_id))
;
create table website_categories (
    website_id int, category_id int, primary key(website_id, category_id))
;
insert website_categories values
    (8,1),(8,3),(8,5),
    (1,1),(2,3),(4,5)
;
insert adverts (advert_id, title, user_id) values
    (1, 'StackExchange', 1),
    (2, 'StackOverflow', 1),
    (3, 'SuperUser', 1),
    (4, 'ServerFault', 1),
    (5, 'Programming', 1),
    (6, 'C#', 2),
    (7, 'Java', 2),
    (8, 'Python', 2),
    (9, 'Perl', 2),
   (10, 'Google', 3)
;
update adverts set advert_type = 'text'
;
insert advert_categories values
    (1,1),(1,3),
    (2,3),(2,4),
    (3,1),(3,2),(3,3),(3,4),
    (4,1),
    (5,4),
    (6,1),(6,4),
    (7,2),
    (8,1),
    (9,3),
   (10,3),(10,5)
;

Data properties

  • each website can belong to multiple categories
  • for simplicity, all adverts are of type 'text'
  • each advert can belong to multiple categories. If a website has multiple categories that are matched multiple times in advert_categories for the same user_id, this causes the advert_id's to show twice when using a straight join between 3 tables in the next query.

This query joins the 3 tables together (notice that ids 1, 3 and 10 each appear twice)

select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
inner join adverts a on a.advert_id = ac.advert_id and  a.advert_type = 'text'
where wc.website_id='8'
order by a.advert_id

To make each website show only once, this is the core query to show all eligible ads, each only once

        select *
        from adverts a
        where a.advert_type = 'text'
          and exists (
            select *
            from website_categories wc
            inner join advert_categories ac on wc.category_id = ac.category_id
            where wc.website_id='8'
              and a.advert_id = ac.advert_id)

The next query retrieves all the advert_id's to be shown

select advert_id, user_id
from (
    select
        advert_id, user_id,
        @r := @r + 1 r
    from (select @r:=0) r
    cross join 
    (
        # core query -- vvv
        select a.advert_id, a.user_id
        from adverts a
        where a.advert_type = 'text'
          and exists (
            select *
            from website_categories wc
            inner join advert_categories ac on wc.category_id = ac.category_id
            where wc.website_id='8'
              and a.advert_id = ac.advert_id)
        # core query -- ^^^
        order by rand()
    ) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2

There are 3 levels to this query

  1. aliased EligibleAdsAndUserIDs: core query, sorted randomly using order by rand()
  2. aliased RowNumbered: row number added to core query, using MySQL side-effecting @variables
  3. the outermost query forces mysql to collect rows as numbered randomly in the inner queries, and group by user_id causes it to retain only the first row for each user_id. limit 2 causes the query to stop as soon as two distinct user_id's have been encountered.

This is the final query which takes the advert_id's from the previous query and joins it back to table adverts to retrieve the required columns.

  1. only once per user_id
  2. feature user's with more ads proportionally (statistically) to the number of eligible ads they have

Note: Point (2) works because the more ads you have, the more likely you will hit the top placings in the row numbering subquery

select a.advert_id, a.title, a.url, a.user_id
from
(
    select advert_id
    from (
        select
            advert_id, user_id,
            @r := @r + 1 r
        from (select @r:=0) r
        cross join 
        (
            # core query -- vvv
            select a.advert_id, a.user_id
            from adverts a
            where a.advert_type = 'text'
              and exists (
                select *
                from website_categories wc
                inner join advert_categories ac on wc.category_id = ac.category_id
                where wc.website_id='8'
                  and a.advert_id = ac.advert_id)
            # core query -- ^^^
            order by rand()
        ) EligibleAdsAndUserIDs
    ) RowNumbered
    group by user_id
    order by r
    limit 2
) Top2
inner join adverts a on a.advert_id = Top2.advert_id;
like image 189
RichardTheKiwi Avatar answered Sep 18 '22 11:09

RichardTheKiwi