I have a transactional database with sales data and user id like the following:
id_usuarioweb dt_fechaventa
1551415 2015-08-01 14:57:21.737
1551415 2015-08-06 15:34:21.920
6958538 2015-07-30 09:26:24.427
6958538 2015-08-05 09:30:06.247
6958538 2015-08-31 17:39:02.027
39101175 2015-08-05 16:34:17.990
39101175 2015-09-20 20:37:26.043
1551415 2015-09-05 13:41:43.767
3673384 2015-09-06 13:34:23.440
I would like to calculate the average difference between consecutive sale dates for the same customer in the database (to find the average frequency with which each user buys).
I'm aware I can use DATEDIFF with two columns, but I'm having trouble doing it within a single column while "grouping" by user id.
The desired outcome would be like this:
id_usuarioweb avgtime_days
1551415 5
6958538 25
39101175 25
1551415 0
3673384 0
How can I achieve this? I would have the database ordered by user_id and then dt_fechaventa (the sale time).
USING: SQL Server 2008
Answer: To find the midpoint between two dates in Oracle, you could try the following: SELECT TO_DATE(date1, 'yyyy/mm/dd') + ((TO_DATE(date2, 'yyyy/mm/dd') - TO_DATE(date1, 'yyyy/mm/dd')) / 2) FROM dual; This returns the date halfway between date1 and date2, not the elapsed time between them. Note that TO_DATE and dual are Oracle syntax and are not available in SQL Server.
Note: The DATEADD and DATEDIFF SQL functions can be used in the SELECT, WHERE, HAVING, GROUP BY and ORDER BY clauses.
To calculate this average, you can use the AVG function. Averages per minute: to obtain the averages per minute, you must retrieve E3TimeStamp's seconds and milliseconds. To do so, multiply this field by 24 (to convert the time base into hours), and then by 60 (to convert it into minutes).
select username, avg(datediff(ss, start_date, end_date)) as avg_seconds ... DATEDIFF can measure the difference in any time unit up to years by varying the first parameter, which can be ss, mi, hh, dd, wk, mm or yy.
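I think what you are looking for can be calculated like this: take the maximum and minimum dates, get the difference between them, and divide by the number of intervals between purchases (the number of purchases minus one). Users with fewer than two purchases get 0.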
SELECT id_usuarioweb,
       CASE
           WHEN COUNT(*) < 2 THEN 0
           ELSE DATEDIFF(dd, MIN(dt_fechaventa), MAX(dt_fechaventa)) / (COUNT(*) - 1)
       END AS avgtime_days
FROM mytable
GROUP BY id_usuarioweb
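As a cross-check, the same average can be computed by pairing each sale with the previous sale of the same user and averaging the individual gaps. This is only a sketch: it assumes the table is called mytable as in the query above, and the names ordered, cur, prev and rn are made up for illustration. It uses ROW_NUMBER and a self-join because LAG is not available in SQL Server 2008. Unlike the query above, which performs integer division and therefore truncates, this version casts to float and keeps the fractional part.

```sql
-- Number each user's sales in chronological order.
WITH ordered AS (
    SELECT id_usuarioweb,
           dt_fechaventa,
           ROW_NUMBER() OVER (PARTITION BY id_usuarioweb
                              ORDER BY dt_fechaventa) AS rn
    FROM mytable
)
SELECT cur.id_usuarioweb,
       -- AVG ignores the NULL gap of each user's first sale;
       -- users with a single sale have no gaps at all, so return 0.
       COALESCE(
           AVG(CAST(DATEDIFF(dd, prev.dt_fechaventa, cur.dt_fechaventa) AS float)),
           0
       ) AS avgtime_days
FROM ordered AS cur
LEFT JOIN ordered AS prev
       ON prev.id_usuarioweb = cur.id_usuarioweb
      AND prev.rn = cur.rn - 1   -- the immediately preceding sale
GROUP BY cur.id_usuarioweb;
```

Both approaches should agree up to the truncation caused by integer division. The MIN/MAX version reads the table once and is simpler, so the pairwise version is mainly useful when you also need the individual gaps.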
EDIT: (by @GordonLinoff)
The reason that this is correct is easily seen if you look at the math. Consider three dates, a, b, and c.
The average time between them is:
((b - a) + (c - b)) / 2
This simplifies to:
(c - a) / 2
In other words, the intermediate value cancels out, and this holds regardless of the number of intermediate values.
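The general case can be written out as a telescoping sum. In the sketch below, the symbols d_1 through d_n are my own notation for one user's n sale dates sorted in ascending order; they do not appear in the original answer.

```latex
\[
\frac{1}{n-1}\sum_{i=1}^{n-1}\left(d_{i+1}-d_{i}\right)
  = \frac{(d_2-d_1)+(d_3-d_2)+\cdots+(d_n-d_{n-1})}{n-1}
  = \frac{d_n-d_1}{n-1}
\]
% Worked example with the three sample rows of user 6958538
% (2015-07-30, 2015-08-05, 2015-08-31), counted in whole days:
\[
\frac{6+26}{2} = \frac{32}{2} = 16
\]
```

Here 6 and 26 are the day differences between consecutive sales, and 32 is the day difference between the first and last sale, which is what DATEDIFF(dd, MIN(...), MAX(...)) returns for that user.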