I built a query, to let me return average ranking of some Apps.
But for some of them, for some month, we don't have data, cause the app is a new one (let's say, the app is out since this month, and so we are collecting data since this month)
SELECT
DATE_FORMAT(date, '%Y-%m'),
app_id,
AVG(rank)
FROM wadstats.applestore_ranking
where app_id IN (100,
2,
3,
4,
5,
6)
GROUP BY MONTH(date), app_id
ORDER BY CASE WHEN app_id = 100 THEN 1 ELSE 2 END, date ASC
I was in need to show first the app_id = 100
But for app_id = 8, I don't have data for august for exemple.
Then results looks like
'2015-07', '100', '3.9355'
'2015-04', '100', '49.5000'
'2015-08', '100', '5.2258'
'2015-05', '100', '16.3333'
'2015-09', '100', '6.1333'
'2015-06', '100', '7.5667'
'2015-10', '100', '5.7727'
'2015-04', '2', '6.0000'
'2015-08', '2', '9.8710'
'2015-05', '2', '6.4667'
'2015-09', '2', '8.9667'
'2015-06', '2', '8.5333'
'2015-10', '2', '9.9545'
'2015-07', '2', '10.5806'
'2015-05', '3', '56.3929'
'2015-09', '3', '55.1667'
'2015-06', '3', '35.2500'
'2015-07', '3', '38.7143'
'2015-04', '3', '38.7500'
'2015-08', '3', '52.5500'
'2015-09', '4', '30.2105'
'2015-06', '4', '27.9231'
'2015-10', '4', '30.0000'
'2015-07', '4', '47.0000'
'2015-08', '4', '32.6818'
'2015-06', '5', '46.8667'
'2015-10', '5', '86.6667'
'2015-07', '5', '63.5185'
'2015-04', '5', '24.2500'
'2015-08', '5', '67.3571'
'2015-10', '6', '30.1818'
i wish instead to have null for each month even if no data was available for this specific month
Expected results
'2015-07', '100', '3.9355'
'2015-04', '100', '49.5000'
'2015-08', '100', '5.2258'
'2015-05', '100', '16.3333'
'2015-09', '100', '6.1333'
'2015-06', '100', '7.5667'
'2015-10', '100', '5.7727'
'2015-04', '2', '6.0000'
'2015-08', '2', '9.8710'
'2015-05', '2', '6.4667'
'2015-09', '2', '8.9667'
'2015-06', '2', '8.5333'
'2015-10', '2', '9.9545'
'2015-07', '2', '10.5806'
'2015-05', '3', '56.3929'
'2015-09', '3', '55.1667'
'2015-06', '3', '35.2500'
'2015-07', '3', '38.7143'
'2015-04', '3', '38.7500'
'2015-08', '3', '52.5500'
'2015-09', '4', '30.2105'
'2015-06', '4', '27.9231'
'2015-05', '4', NULL
'2015-10', '4', '30.0000'
'2015-07', '4', '47.0000'
'2015-08', '4', '32.6818'
'2015-06', '5', '46.8667'
'2015-10', '5', '86.6667'
'2015-07', '5', '63.5185'
'2015-04', '5', '24.2500'
'2015-08', '5', '67.3571'
'2015-04', '6', NULL
'2015-05', '6', NULL
'2015-06', '6', NULL
'2015-07', '6', NULL
'2015-08', '6', NULL
'2015-09', '6', NULL
'2015-10', '6', '30.1818'
If i need to have 0 instead of NULL, that would be ok too, but I need to have for each month in the DB, to have value for each app_id
Thanks so much by advance
The query below uses a derived table (alias inr
) to put together YearMonth/app_id combinations. It then uses that in a left join
to fetch data whether or not it exists in table applestore_ranking
.
Use ifnull()
if you want a zero to appear as opposed to a NULL. So that part would become ifnull(AVG(r.rank),0) as Rank
, instead.
Note, it would help one to simply put this staging together then highlight the code for the inr
select alone and see its simple output. This would make the left join, which then follows, easier to understand.
The concept of the Helper table is used all the time in sql. Sometimes they are put together on the fly, and dropped. Other times they are permanent.
create schema appleSandbox;
use appleSandbox;
-- drop table applestore_ranking;
create table applestore_ranking
( id int auto_increment primary key,
app_id int not null,
date date not null, -- not a great column name
rank int not null
);
-- truncate table applestore_ranking;
insert applestore_ranking (app_id,date,rank) values
(2,'2015-08-01',1),(2,'2015-09-05',10),(2,'2015-09-12',11),(2,'2015-10-01',14),
(6,'2015-10-01',7),(6,'2015-10-05',6),(6,'2015-10-14',2),
(100,'2015-09-01',16),(100,'2015-10-01',16),(100,'2015-10-05',17),(100,'2015-10-14',18);
create table monthHelper
( -- load this up with a few years worth
id int auto_increment primary key,
theDate date not null, -- slightly better column name
wantToSee int not null -- do we want to see it in results or not? 0=no, 1=yes
);
-- note only a few wantToSee have been turned on to 1
insert monthHelper(theDate,wantToSee) values
('2015-05-01',0),('2015-06-01',0),('2015-07-01',0),('2015-08-01',1),('2015-09-01',1),('2015-10-01',1),('2015-11-01',0),('2015-12-01',0),
('2016-01-01',0),('2016-02-01',0),('2016-03-01',0); -- etc
SELECT DATE_FORMAT(inr.theDate, '%Y-%m') as YearMonth, inr.app_id, AVG(r.rank) as Rank
FROM
( select distinct mh.theDate,r.app_id
from monthhelper mh
cross join applestore_ranking r
where mh.wantToSee=1
and r.app_id IN (100,2,3,4,5,6)
) inr
left join applestore_ranking r
on r.app_id=inr.app_id and year(inr.theDate)=year(r.date) and month(inr.theDate)=month(r.date)
GROUP BY MONTH(inr.theDate), inr.app_id
ORDER BY CASE WHEN inr.app_id = 100 THEN 1 ELSE 2 END, inr.theDate ASC
+-----------+--------+---------+
| YearMonth | app_id | Rank |
+-----------+--------+---------+
| 2015-08 | 100 | NULL |
| 2015-09 | 100 | 16.0000 |
| 2015-10 | 100 | 17.0000 |
| 2015-08 | 2 | 1.0000 |
| 2015-08 | 6 | NULL |
| 2015-09 | 2 | 10.5000 |
| 2015-09 | 6 | NULL |
| 2015-10 | 2 | 14.0000 |
| 2015-10 | 6 | 5.0000 |
+-----------+--------+---------+
drop schema AppleSandbox;
thanks to Drew,
Here is a way to perform in a single query, what explained us in detail Drew.
Thanks a lot for your tip using combination of CROSS JOIN
SELECT
inr.date AS 'month',
inr.app_id,
r.title,
AVG(r.rank) as 'ranking'
FROM
(
SELECT
DISTINCT DATE_FORMAT(mh.date, '%Y-%m') date,
r.app_id
FROM
applestore_ranking mh
CROSS JOIN
applestore_ranking r
WHERE
mh.app_id = 100
AND
r.app_id IN (100, 1, 2, 3, 4, 5)
) inr
LEFT JOIN
applestore_ranking r ON r.app_id = inr.app_id
AND inr.date = DATE_FORMAT(r.date, '%Y-%m')
GROUP BY
inr.date,
inr.app_id
ORDER BY
CASE WHEN inr.app_id = 100 THEN 1 ELSE 2 END
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With