Situation
We have a PostgreSQL 8.4 database containing user sessions with login date/time and logout date/time per row. Our web application records this time and also handles the case when user does not logout explicitly (session timeout). So a login date/time and logout date/time are given in every case.
Goal
I need user statistics of the max number of concurrent sessions a day. So, I can say the following: "At 2015-03-16 the peak of concurrent users logged in was six."
Similar questions
A similar question has been answered here: SQL max concurrent sessions per hour of day However, I was unable to adapt the solution to my case, where I want to have a result table, which shows the max. number of concurrent user sessions per day and not per hour. The table scheme varies also slightly, because one row in my case contains the login and the logout date/time together, whereas in the example each row represent either a login or a logout. In addition, the question is based on a MS SQL database environment instead of PostgreSQL.
Considerations
Table scheme:
user_id | login_date | login_time | logout_date | logout_time
------------+--------------+--------------+---------------+-------------
USER32 | 2014-03-03 | 08:23:00 | 2014-03-03 | 14:44:00
USER82 | 2014-03-03 | 08:49:00 | 2014-03-03 | 17:18:00
USER83 | 2014-03-03 | 09:40:00 | 2014-03-03 | 17:31:00
USER36 | 2014-03-03 | 09:50:00 | 2014-03-03 | 16:10:00
USER37 | 2014-03-03 | 11:44:00 | 2014-03-03 | 15:21:00
USER72 | 2014-03-03 | 12:52:00 | 2014-03-03 | 12:55:00
Example
Following example illustrated as timeline via Google Charts API should help understand the problem: http://i.imgur.com/ZOjnLll.png
Given this example of a day 2015-03-03, all users except of USER78 (6 users) were logged in between 12:52 and 12:55 that day. This is the max number of concurrently logged in users and I need such a statistic per day for a given time range.
Day | MaxNumberOfConcurrentSessions
------------+--------------------------------
2015-03-01 | 2
2015-03-02 | 3
2015-03-03 | 6
...
Example of timeline screenshot above as Google Charts API.
google.setOnLoadCallback(drawChart);
function drawChart() {
var container = document.getElementById('example5.1');
var chart = new google.visualization.Timeline(container);
var dataTable = new google.visualization.DataTable();
dataTable.addColumn({ type: 'string', id: 'Room' });
dataTable.addColumn({ type: 'string', id: 'Name' });
dataTable.addColumn({ type: 'date', id: 'Start' });
dataTable.addColumn({ type: 'date', id: 'End' });
dataTable.addRows([
["USER78", '', new Date(2014,03,03,20,38), new Date(2014,03,03,21,14)],
["USER83", '', new Date(2014,03,03,09,40), new Date(2014,03,03,17,31)],
["USER72", '', new Date(2014,03,03,08,43), new Date(2014,03,03,08,43)],
["USER72", '', new Date(2014,03,03,09,40), new Date(2014,03,03,09,40)],
["USER72", '', new Date(2014,03,03,10,03), new Date(2014,03,03,10,06)],
["USER72", '', new Date(2014,03,03,12,52), new Date(2014,03,03,12,55)],
["USER72", '', new Date(2014,03,03,21,13), new Date(2014,03,03,21,13)],
["USER72", '', new Date(2014,03,03,21,37), new Date(2014,03,03,21,38)],
["USER72", '', new Date(2014,03,03,23,14), new Date(2014,03,03,23,15)],
["USER72", '', new Date(2014,03,03,23,27), new Date(2014,03,03,23,28)],
["USER36", '', new Date(2014,03,03,08,05), new Date(2014,03,03,09,17)],
["USER36", '', new Date(2014,03,03,09,50), new Date(2014,03,03,16,10)],
["USER36", '', new Date(2014,03,03,16,12), new Date(2014,03,03,20,29)],
["USER32", '', new Date(2014,03,03,08,23), new Date(2014,03,03,14,44)],
["USER82", '', new Date(2014,03,03,08,49), new Date(2014,03,03,17,18)],
["USER37", '', new Date(2014,03,03,08,04), new Date(2014,03,03,08,06)],
["USER37", '', new Date(2014,03,03,11,44), new Date(2014,03,03,15,21)],
["USER37", '', new Date(2014,03,03,15,34), new Date(2014,03,03,15,51)],
["USER37", '', new Date(2014,03,03,16,12), new Date(2014,03,03,16,14)],
["USER37", '', new Date(2014,03,03,16,52), new Date(2014,03,03,16,54)],
["USER37", '', new Date(2014,03,03,17,07), new Date(2014,03,03,17,08)],
["USER37", '', new Date(2014,03,03,20,20), new Date(2014,03,03,20,24)],
["USER37", '', new Date(2014,03,03,21,03), new Date(2014,03,03,21,20)],
["USER37", '', new Date(2014,03,03,22,42), new Date(2014,03,03,23,05)],
["USER37", '', new Date(2014,03,03,23,51), new Date(2014,03,03,23,56)],
["USER01", '', new Date(2014,03,03,16,11), new Date(2014,03,03,16,12)]
]);
var options = {
timeline: { colorByRowLabel: true }
};
chart.draw(dataTable, options);
}
<script type="text/javascript" src="https://www.google.com/jsapi?autoload={'modules':[{'name':'visualization',
'version':'1','packages':['timeline']}]}"></script>
<div id="example5.1" style="width:5000px;height: 600px;"></div>
I would serialize logins and logouts with UNION ALL
, "in" counts as 1, "out" counts as -1. Then compute a running count with a simple window function and get the max per day.
Since it has not been specified, assuming that:
WITH range AS (SELECT '2014-03-01'::date AS start_date -- time range
, '2014-03-31'::date AS end_date) -- inclusive bounds
, cte AS (
SELECT *
FROM tbl, range r
WHERE login_date <= r.end_date
AND logout_date >= r.start_date
)
, ct AS (
SELECT log_date, sum(ct) OVER (ORDER BY log_date, log_time, ct) AS session_ct
FROM (
SELECT logout_date AS log_date, logout_time AS log_time, -1 AS ct FROM cte
UNION ALL
SELECT login_date, login_time, 1 FROM cte
) sub
)
SELECT log_date, max(session_ct) AS max_sessions
FROM ct, range r
WHERE log_date BETWEEN r.start_date AND r.end_date -- crop actual time range
GROUP BY 1
ORDER BY 1;
You might use the OVERLAPS
operator in cte
:
AND (login_date, logout_date) OVERLAPS (r.start_date, r.end_date)
Details:
But that might not be a good idea because (per documentation):
Each time period is considered to represent the half-open interval
start
<=
time
<
end
, unlessstart
andend
are equal in which case it represents that single time instant. This means for instance that two time periods with only an endpoint in common do not overlap.
Bold emphasis mine. The upper bound of your range would have to be the day after your desired time frame.
CTE are available since Postgres 8.4.
The 1st CTE range
is just for convenience of providing the time range once.
The 2nd CTE cte
selects only relevant rows: those that ...
The 3rd CTE ct
serializes "in" and "out" points with values of +/-1 and computes a running count with the aggregate function sum()
used as window function. Those are available since Postgres 8.4.
In the final SELECT
trim leading and trailing days and aggregate the maximum per day. Voilá.
SQL Fiddle for Postgres 9.6.
Postgres 8.4 is too old and not available any more, but should work the same. I added a rows to the test case - one spanning multiple days. Should make it more useful.
I would generally use timestamp
instead of date
and time
. Same size, easier to handle. Or timestamptz
if multiple time zones can be involved.
An index on (login_date, logout_date DESC)
is instrumental for performance as a bare minimum.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With