Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to edit my query to have a more accurate user retention result?

Tags:

mysql

My goal is to find how many of the visitors that entered my website during the last 7 days, have also visited today. That means if someone visited 05-02 and today, it will count it on 05-02.

The problem with my query is that if someone visited 05-01 and 05-02 and also today, it will count him only on 05-01. But I want the query to count him on 05-02 also.

How can I edit the query below to do this?

SELECT COUNT( DISTINCT v.`hash` ) hashCount, DATE( v.`timestamp` ) AS views
FROM audience v
INNER JOIN behaviour b ON v.`hash` = b.`hash` 
WHERE v.timestamp
BETWEEN SUBDATE( CURDATE( ) , 7 ) 
AND CURDATE( ) 
AND DATE( b.timestamp ) = CURDATE( ) 
GROUP BY views
like image 381
villoui Avatar asked May 04 '16 13:05

villoui


Video Answer


2 Answers

To make it easier just take all people who visited totay and look for a visit in the past 7 days and if one is found return the last date of visit. i think this approach is easier to comprehend:

select b.`hash` ,  
       audience.last_visit 
from behaviour b
 inner join (select v.`hash`, max(v.timestamp) as last_visit from audience v
  where DATE(v.timestamp) between date_sub(current_date, interval 7 day) and 
  date_sub(current_date, interval 1 day)  
  group by v.`hash`) as audience 
 on(b.`hash` = audience.`hash`)
where DATE(b.timestamp) = CURDATE();
  • We join on a select that already consists of the information we need (latest visit per hash during the last 7 days).

The number of rows returned is the number of visitors who visited your page today and during the last week.

You can also count from your select to get the number you were looking for as a query result:

select count(*) from    
(select b.`hash` ,  
       audience.last_visit 
from behaviour b
 inner join (select v.`hash`, max(v.timestamp) as last_visit from audience v
  where DATE(v.timestamp) between date_sub(current_date, interval 7 day) and 
  date_sub(current_date, interval 1 day)  
  group by v.`hash`) as audience 
 on(b.`hash` = audience.`hash`)
where DATE(b.timestamp) = CURDATE() ) as my_visitors;

- Testdata

    drop table if exists your_schema.behaviour;
    create table your_schema.behaviour(`hash` varchar(255), `timestamp` timestamp) ;

    insert into your_schema.behaviour
    values ('ab','2016-05-23'),('ac','2016-05-23');

    drop table if exists your_schema.audience;
    create table your_schema.audience (`hash` varchar(255), `timestamp` timestamp) ;
insert into your_schema.audience
values ('ab','2016-05-01'),('ab','2016-05-02'),('ab','2016-05-03'),('ab','2016-05-04'),('ab','2016-05-21'),('ab','2016-05-23'),
       ('ac','2016-05-01'),('ac','2016-05-02'),('ac','2016-05-03'),('ac','2016-05-04'),('ac','2016-05-21'),('ac','2016-05-23'),
       ('ad','2016-05-01'), ('ad','2016-05-02'), ('ad','2016-05-03'),('ad','2016-05-04'),('ad','2016-05-21'),('ad','2016-05-23');
like image 164
iLikeMySql Avatar answered Nov 04 '22 10:11

iLikeMySql


You can also do by using DATE_SUB, check following query

SELECT COUNT(DISTINCT v.`hash`) AS hashCount, DATE(v.`timestamp`) AS views
FROM audience v
INNER JOIN behaviour b ON v.`hash` = b.`hash` 
WHERE views >= DATE_SUB(Now(),INTERVAL 7 DAY)
GROUP BY views

for more information visit following url

https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html

like image 41
chetan varma Avatar answered Nov 04 '22 10:11

chetan varma