week cookie
1 a
1 b
1 c
1 d
2 a
2 b
3 a
3 c
3 d
This table records visits to a website by week. Each cookie represents an individual person, and each row means that person visited the site in that week. For example, the last row means 'd' came to the site in week 3.
Given a start week, I want to find out how many of the same people keep coming back in each following week.
For example, if I start at week 1, I should get a result like:
1 | 4
2 | 2
3 | 1
Because 4 users came in week 1, only 2 of them (a, b) came back in week 2, and only 1 of them (a) came in all 3 weeks.
How can I write a SELECT query to find this out? The table will be big: there might be 100 weeks, so I want to find the right way to do it.
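To pin down the expected numbers before looking at SQL, here is a minimal Python sketch of the rule described above: start with everyone seen in the start week, then keep intersecting with each following week's visitors. The function name `retention_from` and the `SAMPLE_VISITS` list are made up for illustration; they are not part of any schema.

```python
from collections import defaultdict

# (week, cookie) rows copied from the table in the question.
SAMPLE_VISITS = [
    (1, "a"), (1, "b"), (1, "c"), (1, "d"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "c"), (3, "d"),
]


def retention_from(visits, start_week):
    """Return [(n, users)]: users seen in every week start_week .. start_week + n - 1."""
    by_week = defaultdict(set)
    for week, cookie in visits:
        by_week[week].add(cookie)

    result = []
    # Copy so the in-place intersection below does not modify by_week.
    still_here = set(by_week.get(start_week, set()))
    week = start_week
    while still_here:
        result.append((week - start_week + 1, len(still_here)))
        week += 1
        # Only cookies that also show up in the next week survive.
        still_here &= by_week.get(week, set())
    return result


if __name__ == "__main__":
    for row in retention_from(SAMPLE_VISITS, 1):
        print(row)
```

A cookie that skips a week drops out for good, which is why 'c' and 'd' do not count for week 3 even though they visited in week 3.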
This query uses variables to track adjacent weeks and work out if they are consecutive:
set @start_week = 2, @week := 0, @conseq := 0, @cookie := '';
select conseq_weeks, count(*)
from (
  select
    cookie,
    -- length of the current run of adjacent weeks for this cookie
    if (cookie != @cookie or week != @week + 1, @conseq := 0, @conseq := @conseq + 1) + 1 as conseq_weeks,
    @cookie := cookie as lastcookie,
    @week := week as lastweek
  from (select week, cookie from webhist where week >= @start_week order by 2, 1) x
) y
-- keep only runs that begin at @start_week and have had no gap since
where conseq_weeks = lastweek - @start_week + 1
group by 1;
This is for week 2. For another week, change the @start_week variable at the top.
Here's the test:
create table webhist(week int, cookie char);
insert into webhist values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'c'), (3, 'd');
Output of the above query with @start_week = 1:
+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
| 1 | 4 |
| 2 | 2 |
| 3 | 1 |
+--------------+----------+
Output of the above query with @start_week = 2:
+--------------+----------+
| conseq_weeks | count(*) |
+--------------+----------+
| 1 | 2 |
| 2 | 1 |
+--------------+----------+
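If window functions are available, the same result can be had without session variables. The sketch below is an alternative technique, not the query above: it numbers each cookie's weeks with ROW_NUMBER() and keeps a row only when its week equals start + row number - 1, meaning there has been no gap since the start week. It runs against SQLite through Python's sqlite3 purely so that it is self-contained. I'm assuming SQLite 3.25 or newer for window function support; the same SELECT should carry over to MySQL 8.0, but I have only traced it against the sample data here. The helper names are made up.

```python
import sqlite3

WINDOW_QUERY = """
SELECT rn AS conseq_weeks, COUNT(*) AS users
FROM (
    SELECT cookie,
           week,
           ROW_NUMBER() OVER (PARTITION BY cookie ORDER BY week) AS rn
    FROM (SELECT DISTINCT week, cookie
          FROM webhist
          WHERE week >= :start) AS d
) AS r
-- the n-th visit at or after the start week must fall exactly in week start + n - 1
WHERE week = :start + rn - 1
GROUP BY rn
ORDER BY rn
"""

SAMPLE_ROWS = [
    (1, "a"), (1, "b"), (1, "c"), (1, "d"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "c"), (3, "d"),
]


def make_webhist(rows):
    """Build an in-memory webhist table like the test table above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table webhist(week int, cookie char)")
    conn.executemany("insert into webhist values (?, ?)", rows)
    return conn


def conseq_counts(conn, start_week):
    """Run the window-function query for one start week."""
    return conn.execute(WINDOW_QUERY, {"start": start_week}).fetchall()


if __name__ == "__main__":
    conn = make_webhist(SAMPLE_ROWS)
    print(conseq_counts(conn, 1))
    print(conseq_counts(conn, 2))
```

The SELECT DISTINCT is there so that a cookie logged twice in the same week does not throw off the row numbering.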
p.s. Good question, but a bit of a ball-breaker
For some reason most of these answers are very overcomplicated. This doesn't need cursors or for loops or anything of the sort...
I want to find out how many (same) people keep coming back in the following week, when given a start week to look at.
If you want to know, for each week, how many users visited in that week and again the week after (counting only users who also visited in the start week @week):
SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
FROM visits
WHERE EXISTS (
    -- the same cookie also appears in the following week
    SELECT TOP 1 1
    FROM visits AS nextWeek
    WHERE nextWeek.week = visits.week+1
    AND nextWeek.cookie = visits.cookie
)
AND EXISTS (
    -- the same cookie also appears in the start week
    SELECT TOP 1 1
    FROM visits AS searchWeek
    WHERE searchWeek.week = @week
    AND searchWeek.cookie = visits.cookie
)
GROUP BY visits.week
ORDER BY visits.week
However, this will not show you diminishing results over time. If you have 10 users in week 1, and then 5 different users visit in each of the next 5 weeks, you would keep seeing 1=10, 2=5, 3=5, 4=5, 5=5, 6=5 and so on. Instead you want to see 5=x, where x is the number of users who visited every week for 5 weeks straight. To do this, see below:
SELECT visits.week, COUNT(1) AS [NumRepeatUsers]
FROM visits
WHERE EXISTS (
    -- the same cookie also appears in the following week
    SELECT TOP 1 1
    FROM visits AS nextWeek
    WHERE nextWeek.week = visits.week+1
    AND nextWeek.cookie = visits.cookie
)
AND EXISTS (
    -- the same cookie also appears in the start week
    SELECT TOP 1 1
    FROM visits AS searchWeek
    WHERE searchWeek.week = @week
    AND searchWeek.cookie = visits.cookie
)
AND visits.week - @week = (
    -- the same cookie appears in every week after the start week, up to this one
    SELECT COUNT(1) AS [Count]
    FROM visits AS searchWeek
    WHERE searchWeek.week BETWEEN @week+1 AND visits.week
    AND searchWeek.cookie = visits.cookie
)
GROUP BY visits.week
ORDER BY visits.week
This will give you 1=10,2=5,3=4,4=3,5=2,6=1 or the like
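To make the counting idea easy to try out, here is a runnable sketch of a variant of it, wrapped in Python's sqlite3 so it is self-contained. It is not the exact T-SQL above: it keeps only the correlated COUNT check (a cookie must appear in every week from the start week through the current one) and labels rows 1, 2, 3... from the start week, so the output lines up with the result the question asks for. The table name `visits` follows this answer; the helper names and the `:start` parameter are assumptions for the example.

```python
import sqlite3

COUNT_QUERY = """
SELECT v.week - :start + 1 AS conseq_weeks, COUNT(*) AS users
FROM (SELECT DISTINCT week, cookie
      FROM visits
      WHERE week >= :start) AS v
WHERE (
    -- number of distinct weeks this cookie was seen from :start through v.week
    SELECT COUNT(DISTINCT p.week)
    FROM visits AS p
    WHERE p.cookie = v.cookie
      AND p.week BETWEEN :start AND v.week
) = v.week - :start + 1
GROUP BY v.week
ORDER BY v.week
"""

QUESTION_ROWS = [
    (1, "a"), (1, "b"), (1, "c"), (1, "d"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "c"), (3, "d"),
]


def make_visits(rows):
    """Build an in-memory visits(week, cookie) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (week INT, cookie CHAR)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    return conn


def repeat_users(conn, start_week):
    """Users who visited every week from start_week up to each later week."""
    return conn.execute(COUNT_QUERY, {"start": start_week}).fetchall()


if __name__ == "__main__":
    conn = make_visits(QUESTION_ROWS)
    print(repeat_users(conn, 1))
    print(repeat_users(conn, 2))
```

On a big table the correlated subquery will want an index on (cookie, week), otherwise every outer row rescans the table.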