Select users who did a purchase after a search where some conditions apply in PostgreSQL?

Tags: sql, postgresql

I have this case: users run searches on the website, and some of them go on to make a purchase. How would you select users over 18 who did a search that was not followed by a purchase, where the purchase condition is true and the search session = the purchase session?

This selects users with search events and age over 18.

select DISTINCT ON (users.user) users.user, email, users.age, events.event, events.date
from users
   LEFT JOIN events ON events.user = users.user
where events.event='search'
  and age>18
  and events.condition is true

How would I add the condition that they did a search event previously in the same session?

The table structure for events:

-------------------------------------------------------
| user |   event   |    date    | condition | session |
-------------------------------------------------------
|  1   |  search   | 08-10-2013 |   true    |    A    |
|  1   |  search   | 08-10-2013 |   true    |    A    |
|  2   |  search   | 08-10-2013 |   false   |    B    |
|  2   | purchase  | 09-10-2013 |   false   |    A    |
|  2   |  search   | 09-10-2013 |   true    |    C    |
|  1   | purchase  | 09-10-2013 |   true    |    A    |
|  3   |  search   | 09-10-2013 |   false   |    D    |
|  2   |  search   | 10-10-2013 |   true    |    H    |
|  4   |  search   | 10-10-2013 |   false   |    E    |
|  4   |  search   | 10-10-2013 |   false   |    E    |
|  3   |  search   | 11-10-2013 |   true    |    D    |
|  2   |  other    | 11-10-2013 |   true    |    H    |
|  1   |  search   | 11-10-2013 |   true    |    F    |
|  1   | purchase  | 12-10-2013 |   true    |    F    |
|  3   | purchase  | 12-10-2013 |   false   |    D    |
|  4   |  search   | 12-10-2013 |   true    |    G    |
|  2   |  other    | 12-10-2013 |   true    |    A    |
-------------------------------------------------------

and the user table is:

------------------------
| user | email  | age  |
------------------------
|  1   |   a    |  22  |
|  2   |   b    |  34  |
|  3   |   c    |  15  |
|  4   |   d    |  44  |
|  5   |   e    |  39  |
------------------------

The result should be 2 and 4:

1    a   -> NO because did purchase with condition=true
2    b   -> YES because did search, did purchase BUT condition=false
3    c   -> NO because did search, did purchase with condition=false BUT age<18
4    d   -> YES because did search and no purchase 

Thank you. I'm just starting with Postgres, coming from Mongo, and it is much better!

UPDATE: corrected result for example

Alexandru R asked Dec 02 '25 03:12


2 Answers

If you need only the data from the users table:

select
    u.*
from users as u
where
    u.age > 18 and
    exists (
        select *
        from events as e1
        where 
            e1.user = u.user and e1.condition is true and
            e1.event = 'search' and
            not exists (
                select *
                from events as e2
                where
                    e2.user = u.user and e2.condition is true and
                    e2.event = 'purchase' and
                    e2.session = e1.session and e2.date > e1.date
            )
    )

It could also be done like this:

select
    u.*
from users as u
where
    u.age > 18 and
    exists (
        select *           
        from (
            select
                max(case when e.event = 'search' then e.date end) as search_m_date,
                max(case when e.event = 'purchase' then e.date end) as purchase_m_date
            from events as e
            where e.user = u.user and e.condition is true
            group by e.session
        ) as a
        where
            a.search_m_date is not null and
            (a.purchase_m_date is null or a.search_m_date > a.purchase_m_date)
    )

but I actually like the first one better.
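
To see what the derived table in the second query works with, it can help to run the aggregate on its own for all users at once. This is only an illustrative sketch: grouping by user as well as session and the ORDER BY are additions for readability and are not part of the query above.

```sql
-- Per (user, session): latest search date and latest purchase date,
-- counting only events whose condition is true.
SELECT
    e.user,
    e.session,
    max(CASE WHEN e.event = 'search'   THEN e.date END) AS search_m_date,
    max(CASE WHEN e.event = 'purchase' THEN e.date END) AS purchase_m_date
FROM events AS e
WHERE e.condition IS TRUE
GROUP BY e.user, e.session
ORDER BY e.user, e.session;
```

A session qualifies when search_m_date is not null and purchase_m_date is either null or earlier than search_m_date. Note that this compares the latest search with the latest purchase, so when a search and a purchase share the same date value it can differ from the first query, which only looks for a strictly later purchase.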

sql fiddle demo
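
For trying the queries locally, a setup along the following lines should reproduce the sample data from the question. The column types are assumptions (the question does not state them), and the dates are assumed to be DD-MM-YYYY and are written here in ISO format.

```sql
-- Assumed schema: types are guesses based on the sample data.
-- "user" is quoted because USER is a reserved word in PostgreSQL.
CREATE TABLE users (
    "user" integer PRIMARY KEY,
    email  text,
    age    integer
);

CREATE TABLE events (
    "user"    integer,
    event     text,
    date      date,      -- assumed plain date; a timestamp would order events more precisely
    condition boolean,
    session   text
);

INSERT INTO users ("user", email, age) VALUES
    (1, 'a', 22), (2, 'b', 34), (3, 'c', 15), (4, 'd', 44), (5, 'e', 39);

INSERT INTO events ("user", event, date, condition, session) VALUES
    (1, 'search',   '2013-10-08', true,  'A'),
    (1, 'search',   '2013-10-08', true,  'A'),
    (2, 'search',   '2013-10-08', false, 'B'),
    (2, 'purchase', '2013-10-09', false, 'A'),
    (2, 'search',   '2013-10-09', true,  'C'),
    (1, 'purchase', '2013-10-09', true,  'A'),
    (3, 'search',   '2013-10-09', false, 'D'),
    (2, 'search',   '2013-10-10', true,  'H'),
    (4, 'search',   '2013-10-10', false, 'E'),
    (4, 'search',   '2013-10-10', false, 'E'),
    (3, 'search',   '2013-10-11', true,  'D'),
    (2, 'other',    '2013-10-11', true,  'H'),
    (1, 'search',   '2013-10-11', true,  'F'),
    (1, 'purchase', '2013-10-12', true,  'F'),
    (3, 'purchase', '2013-10-12', false, 'D'),
    (4, 'search',   '2013-10-12', true,  'G'),
    (2, 'other',    '2013-10-12', true,  'A');
```

Against this data, the first query should return users 2 and 4, matching the expected result in the question.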

Roman Pekar answered Dec 04 '25 20:12


Here's one way to go about building the complete query step by step:

  1. Get all search events:

    SELECT *
    FROM events AS s
    WHERE event = 'search'
      AND condition IS TRUE
    
  2. Filter out those that are followed by a successful purchase event:

    SELECT *
    FROM
      events AS s
    LEFT JOIN
      events AS p
      ON  s.user = p.user
      AND s.session = p.session
      AND p.event = 'purchase'
      AND p.condition IS TRUE
      AND p.date > s.date  -- this assumes that `date` stores both date and time
    WHERE s.event = 'search'
      AND s.condition IS TRUE
      AND p.event IS NULL
    
  3. Get users from the users table whose IDs are returned in the previous result set and whose age is above 18:

    SELECT *
    FROM users
    WHERE age > 18
      AND users.user IN (  -- qualified, because a bare `user` means the session user name in PostgreSQL
        SELECT
          s.user
        FROM
          events AS s
        LEFT JOIN
          events AS p
          ON  s.user = p.user
          AND s.session = p.session
          AND p.event = 'purchase'
          AND p.condition IS TRUE
          AND p.date > s.date
        WHERE s.event = 'search'
          AND s.condition IS TRUE
          AND p.event IS NULL
      )
    ;
    

This should give you the desired results although not necessarily in the most efficient way. You could play with the anti-join, which in this query is implemented using the LEFT JOIN + WHERE IS NULL technique, rewriting it with the NOT EXISTS one to see if that performs any better. Additionally, you could try rewriting the IN predicate with an equivalent EXISTS one. (Essentially, @Roman Pekar's first solution could be viewed as the result of both rewritings suggested.)
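
As a sketch of the first rewrite mentioned above, here is the same query with the anti-join expressed through NOT EXISTS while the IN predicate is kept. The alias u and the use of SELECT 1 in the subquery are stylistic choices made for this example.

```sql
SELECT *
FROM users AS u
WHERE u.age > 18
  AND u.user IN (
    SELECT s.user
    FROM events AS s
    WHERE s.event = 'search'
      AND s.condition IS TRUE
      -- anti-join: keep the search only if no successful purchase
      -- follows it in the same session for the same user
      AND NOT EXISTS (
        SELECT 1
        FROM events AS p
        WHERE p.user = s.user
          AND p.session = s.session
          AND p.event = 'purchase'
          AND p.condition IS TRUE
          AND p.date > s.date
      )
  );
```

Prefixing each variant with EXPLAIN ANALYZE is one way to compare the plans PostgreSQL picks for them; whether either form is faster depends on the data and the available indexes.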

Andriy M answered Dec 04 '25 20:12