
Advanced SQL Select Query

week      cookie
1         a
1         b
1         c
1         d
2         a 
2         b
3         a
3         c
3         d

This table records visits to a website by week. Each cookie represents an individual person, and each row means that person visited the site in that week. For example, the last row means 'd' came to the site in week 3.

I want to find out how many of the same people keep coming back in each following week, given a start week to look at.

For example, if I start at week 1, I will get a result like:

1 | 4
2 | 2
3 | 1

Because 4 users came in week 1. Only 2 of them (a, b) came back in week 2. Only 1 of them (a) came in all 3 weeks.

How can I do a select query to find out? The table will be big: there might be 100 weeks, so I want to find the right way to do it.

JJ Liu asked Jul 27 '11 00:07




2 Answers

This query uses user variables to track adjacent weeks for each cookie and work out whether they are consecutive:

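-- Reset the tracking variables before every run.
set @start_week = 2, @week := 0, @conseq := 0, @cookie := '';
select conseq_weeks, count(*)
from (
  select
    cookie,
    -- Length of the current run of consecutive weeks for this cookie.
    if (cookie != @cookie or week != @week + 1, @conseq := 0, @conseq := @conseq + 1) + 1 as conseq_weeks,
    -- True when the row is a cookie's first row and falls in the start week,
    -- or when it directly follows the previous row's week for the same cookie.
    (cookie != @cookie and week <= @start_week) or (cookie = @cookie and week = @week + 1) as conseq,
    -- Remember this row for the comparison on the next row.
    @cookie := cookie as lastcookie,
    @week := week as lastweek
  -- Rows must arrive ordered by cookie, then week.
  from (select week, cookie from webhist where week >= @start_week order by 2, 1) x
) y
where conseq
group by 1;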

This is for week 2. For another start week, change @start_week in the set statement at the top and rerun both statements, so the tracking variables are reset as well.

Here's the test:

create table webhist(week int, cookie char);
insert into webhist values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'c'), (3, 'd');

Output of the above query with @start_week = 1 (so the inner filter is week >= 1):

+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
|            1 |        4 |
|            2 |        2 |
|            3 |        1 |
+--------------+----------+

Output of the above query with @start_week = 2 (so the inner filter is week >= 2):

+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
|            1 |        2 |
|            2 |        1 |
+--------------+----------+
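
A variable-free alternative is sketched below, as a comparison rather than a replacement. It assumes the same webhist table and relies on one observation: a cookie has visited every week from @start_week through week w exactly when its number of distinct visited weeks in that range equals w - @start_week + 1. The motivation is that assigning user variables inside a select is deprecated as of MySQL 8.0 and their evaluation order is not guaranteed. The column alias visitors is made up for illustration.

```sql
-- Sketch only: assumes the webhist(week, cookie) test table from above.
set @start_week = 1;

select
  v.week - @start_week + 1 as conseq_weeks,
  count(distinct v.cookie) as visitors   -- alias chosen for this example
from webhist v
where v.week >= @start_week
  -- Keep the row only if this cookie has no gap between the start week and v.week.
  and (select count(distinct h.week)
       from webhist h
       where h.cookie = v.cookie
         and h.week between @start_week and v.week) = v.week - @start_week + 1
group by v.week
order by v.week;
```

On the test data this returns the same counts as the outputs above (4, 2, 1 for start week 1). The correlated subquery runs once per row, so on a big table an index on (cookie, week) is likely needed; how it compares in speed to the variable-based query is untested.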

p.s. Good question, but a bit of a ball-breaker

Bohemian answered Nov 10 '22 04:11


For some reason most of these answers are overcomplicated. This doesn't need cursors, for loops, or anything of the sort.

I want to find out how many (same) people keep coming back in the following week, when given a start week to look at.

If you want to know, for each week, how many of the users from the start week visited in that week and again the week after:

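-- Count, per week, the users who also visited the following week
-- and who visited in the start week @week.
SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
FROM visits
WHERE EXISTS (
    -- Same cookie visited in the following week.
    SELECT TOP 1 1
    FROM visits AS nextWeek
    WHERE nextWeek.week = visits.week+1
      AND nextWeek.cookie = visits.cookie
  )
  AND EXISTS (
    -- Same cookie visited in the start week.
    SELECT TOP 1 1
    FROM visits AS searchWeek
    WHERE searchWeek.week = @week
      AND searchWeek.cookie = visits.cookie
  )
GROUP BY visits.week
ORDER BY visits.week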

However, this will not show diminishing results over time. If you have 10 users in week 1, and then 5 different users visited for the next 5 weeks, you would keep seeing 1=10, 2=5, 3=5, 4=5, 5=5, 6=5 and so on. Instead you want to see 5=x, where x is the number of users who visited every week for 5 weeks straight. To do this, see below:

SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
FROM visits
WHERE EXISTS (
    -- Same cookie visited in the following week.
    SELECT TOP 1 1
    FROM visits AS nextWeek
    WHERE nextWeek.week = visits.week+1
      AND nextWeek.cookie = visits.cookie
  )
  AND EXISTS (
    -- Same cookie visited in the start week.
    SELECT TOP 1 1
    FROM visits AS searchWeek
    WHERE searchWeek.week = @week
      AND searchWeek.cookie = visits.cookie
  )
  -- Same cookie visited in every week from @week+1 through this week (no gaps).
  AND visits.week - @week = (
    SELECT COUNT(1) AS [Count]
    FROM visits AS searchWeek
    WHERE searchWeek.week BETWEEN @week+1 AND visits.week
      AND searchWeek.cookie = visits.cookie
  )
GROUP BY visits.week
ORDER BY visits.week

This will give you 1=10, 2=5, 3=4, 4=3, 5=2, 6=1 or the like.
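
To try these queries against the question's sample data, a setup along the following lines could be used. This is only a sketch, assuming SQL Server 2008 or later: the column types are guesses, and the table name visits simply follows the queries above. Note that @week has to be declared in the same batch as the query that uses it.

```sql
-- Sketch only: column types are assumptions, data is the question's sample.
CREATE TABLE visits (week INT NOT NULL, cookie CHAR(1) NOT NULL);

INSERT INTO visits (week, cookie) VALUES
  (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
  (2, 'a'), (2, 'b'),
  (3, 'a'), (3, 'c'), (3, 'd');

-- The start week to look at; run the query in the same batch as this line.
DECLARE @week INT = 1;
```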

Seph answered Nov 10 '22 03:11