I need to collect some statistical information in my application. I have a table of users (tb_user). Every time a new user accesses the application, a new record is added to this table, i.e. one row per user. The main fields are id and date_time (the timestamp of the user's first access to the application).
tb_user
id (bigint) | date_time (timestamp with time zone)
1 | 2012-01-29 11:29:50.359-03
2 | 2012-01-31 14:27:10.359-03
I need to get:
the average number of users by day, week and month
Example:
by day: 55.45
by week : XX.XX
by month: XX.XX
EDIT:
My best solution was:
WITH daily_count AS (SELECT COUNT(id) AS user_count FROM tb_user)
SELECT user_count, tbaux2.days, (user_count/tbaux2.days)
FROM daily_count,
     (SELECT EXTRACT(DAY FROM (t2.diff)) + 1 AS days
      FROM (WITH tbaux AS (SELECT min(date_time) AS min FROM tb_user)
            SELECT (now() - min) AS diff
            FROM tbaux) AS t2) AS tbaux2
GROUP BY user_count, tbaux2.days
But this solution only works with EXTRACT(DAY ...). It did not work with weeks and months.
Any help is welcome.
Alternatively:
SELECT user_count, tbaux2.days, (user_count/tbaux2.days) AS userPerDay, ((user_count/tbaux2.days) * 7) AS userPerWeek, ((user_count/tbaux2.days) * 30) AS userPerMonth
EDIT 2:
Based on responses from @Bruno, there are some considerations:
When I asked the question, I was really asking for a way to select the data by day, month and year. I believe the query that I posted and @Bruno refined should be interpreted as an average "per day, per 7 days and per 30 days" rather than by calendar days, weeks and months. Interpreted this way, I believe there will be no problems of the kind mentioned in the example (the 10% drop). This "every N days" approach is the answer I need at the moment, so I will accept this answer.
As an improvement to the post, I suggest:
Thanks.
You should look into aggregate functions (min, max, count, avg), which go hand in hand with GROUP BY. For date-based aggregations, date_trunc is also useful.
For example, this will return the number of rows per day:
SELECT date_trunc('day', date_time) AS day_start,
COUNT(id) AS user_count FROM tb_user
GROUP BY date_trunc('day', date_time);
You can then do the daily average using something like this (with a CTE):
WITH daily_count AS (SELECT date_trunc('day', date_time) AS day_start,
COUNT(id) AS user_count FROM tb_user
GROUP BY date_trunc('day', date_time))
SELECT AVG(user_count) FROM daily_count;
Use 'week' instead of 'day' for the weekly counts, and so on (see the date_trunc documentation).
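As a minimal sketch of that substitution (assuming PostgreSQL and the tb_user table from the question; the names weekly_count, week_start and avg_users_per_week are only illustrative), the weekly version could look like this:

```sql
-- Sketch: average number of new users per calendar week.
-- weekly_count / week_start / avg_users_per_week are illustrative names.
WITH weekly_count AS (
    SELECT date_trunc('week', date_time) AS week_start,
           COUNT(id) AS user_count
    FROM tb_user
    GROUP BY date_trunc('week', date_time)
)
SELECT AVG(user_count) AS avg_users_per_week
FROM weekly_count;
```

Note that a week in which no user signed up produces no row in weekly_count, so it is not counted as a zero in the average. Replacing 'week' with 'month' gives the monthly figure in the same way.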
EDIT: (Following comment: average up to and including 5/1/2012, i.e. before the 6th.)
WITH daily_count AS (SELECT date_trunc('day', date_time) AS day_start,
COUNT(id) AS user_count
FROM tb_user
WHERE date_time >= DATE('2012-01-01') AND date_time < DATE('2012-01-06')
GROUP BY date_trunc('day', date_time))
SELECT SUM(user_count)/(DATE('2012-01-06') - DATE('2012-01-01')) FROM daily_count;
What's above is over-complicated in this case. This should give you the same result (the cast to numeric avoids integer division, which would otherwise truncate the average):
SELECT COUNT(id)::numeric/(DATE('2012-01-06') - DATE('2012-01-01'))
FROM tb_user
WHERE date_time >= DATE('2012-01-01') AND date_time < DATE('2012-01-06');
EDIT 2: After your edit, I guess what you're after is just a single global average for the entire period of existence of your database, rather than groups by month/week/day.
This should give you the average number of rows per day:
WITH total_min_max AS (SELECT
    COUNT(id) AS total_visits,
    MIN(date_time) AS first_date_time,
    MAX(date_time) AS last_date_time
  FROM tb_user)
-- Cast to numeric so the division keeps its decimals.
SELECT total_visits::numeric/((last_date_time::date-first_date_time::date)+1) AS users_per_day
FROM total_min_max;
(I would replace last_date_time with NOW() to make the average over the time until now, rather than until the last visit, if there's no recent visit.)
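A sketch of that variant, assuming PostgreSQL: it uses CURRENT_DATE (today's date in the session time zone) in place of the date of the last visit, so that quiet recent days still count towards the period. The CTE name total_min is only illustrative.

```sql
-- Sketch: average new users per day from the first visit until today.
-- CURRENT_DATE - date yields an integer number of days in PostgreSQL.
WITH total_min AS (
    SELECT COUNT(id) AS total_visits,
           MIN(date_time) AS first_date_time
    FROM tb_user
)
SELECT total_visits::numeric
           / ((CURRENT_DATE - first_date_time::date) + 1) AS users_per_day
FROM total_min;
```

If the table is empty, first_date_time is NULL and the query returns NULL rather than raising an error.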
Then, for daily, weekly, and "monthly":
WITH total_min_max AS (SELECT
    COUNT(id) AS total_visits,
    MIN(date_time) AS first_date_time,
    MAX(date_time) AS last_date_time
  FROM tb_user),
daily_avg AS (
  -- Cast to numeric so the division keeps its decimals.
  SELECT total_visits::numeric/((last_date_time::date-first_date_time::date)+1) AS users_per_day
  FROM total_min_max)
SELECT
  users_per_day,
  (users_per_day * 7) AS users_per_week,
  (users_per_day * 30) AS users_per_month
FROM daily_avg;
This being said, conclusions you draw from such statistics might not be reliable, especially if you want to see how the numbers change over time.
I would also normalise the data per day rather than assuming 30 days in a month (if not per hour, because not all days have 24 hours). Say you have 10 visits per day in Jan 2011 and 10 visits per day in Feb 2011. That gives you 310 visits in Jan and 280 visits in Feb. If you don't pay attention, you could think you've had almost a 10% drop in the number of visitors and that something went wrong in Feb, when really this isn't the case.
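One way to sketch that per-day normalisation in PostgreSQL is to divide each month's count by the actual number of days in that month. The names monthly_count, month_start and users_per_day below are illustrative, not part of the original queries.

```sql
-- Sketch: new users per day for each calendar month,
-- dividing by the real length of the month (28 to 31 days).
WITH monthly_count AS (
    SELECT date_trunc('month', date_time) AS month_start,
           COUNT(id) AS user_count
    FROM tb_user
    GROUP BY date_trunc('month', date_time)
)
SELECT month_start,
       user_count,
       -- The last day of the month is month_start + 1 month - 1 day;
       -- its day-of-month number is the length of the month.
       user_count::numeric
           / EXTRACT(DAY FROM month_start + INTERVAL '1 month' - INTERVAL '1 day')::int
           AS users_per_day
FROM monthly_count
ORDER BY month_start;
```

With the figures above, Jan 2011 gives 310 / 31 = 10 and Feb 2011 gives 280 / 28 = 10, so the apparent drop disappears. The current, still incomplete month is divided by its full length here, so its value will look low until the month ends.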