Am looking for a single query that can convert the following information in the table
name:time :state
a :10:00 AM:login
b :10:05 AM:login
a :10:06 AM:chatting
a :10:08 AM:Idle
b :10:11 AM:chatting
a :10:10 AM:Logout
b :10:12 AM:Logout
to something like this (given the time range 10 AM to 10:15 AM as the querying period)
name: State :Duration
a : chatting :2 Minutes
a : Idle :2 Minutes
b : chatting :1 Minute
Can this be done ONLY using SQL? Am using Informix version 11.5
It can be done in a single SQL statement. Here's the proof.
CREATE TEMP TABLE eventtable
(
name CHAR(3) NOT NULL,
time DATETIME HOUR TO MINUTE NOT NULL,
state CHAR(8) NOT NULL
);
INSERT INTO eventtable(name, time, state) VALUES('a', '10:00', 'login');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:05', 'login');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:06', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:08', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:10', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:12', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:01', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:02', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:03', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:04', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:05', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:06', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:07', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:08', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:09', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:12', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:13', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:14', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:15', 'Logout');
Note the conditions. The result table must exclude the periods between 'login' and the first other event; further, it must exclude the period between 'Logout' and the next event (presumably a 'login'). The self-join between the table on the name column and then the asymmetric join on the time column (using '<
') ensures that events are in time order. The NOT EXISTS sub-select ensures that only adjacent events are considered. Using BETWEEN AND in the sub-query is a mistake because it includes its end points and it is crucial that r1.time
and r2.time
are excluded from the range; it took me a few minutes to spot that bug (the query ran but returned no rows, but why?)!
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS duration
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND NOT EXISTS (SELECT 1 FROM eventtable r3
WHERE r3.time > r1.time AND r3.time < r2.time
AND r3.name = r1.name
AND r3.name = r2.name);
This produces the answer:
name state duration
a chatting 0:02
a Idle 0:02
b chatting 0:01
c chatting 0:01
c Idle 0:01
c Idle 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
The 'duration' value is an INTERVAL HOUR TO MINUTE; if you want a value in just minutes, you have to convert it with a cast (using 4 for the precision to allow for intervals up to 1440 minutes, or 1 day; the data is ambiguous for longer time frames):
(r2.time - r1.time)::INTERVAL MINUTE(4) TO MINUTE
Or:
CAST (r2.time - r1.time AS INTERVAL MINUTE(4) TO MINUTE)
IBM Informix Dynamic Server (IDS) has very verbose notations for time constants. In Standard SQL, you could use TIME as the type and TIME '10:00:00' as a value, but the seconds would be necessary in strict standard SQL. IDS does provide exact types that people want - such as DATETIME HOUR TO MINUTE. You'd also write INTERVAL MINUTE(4) in standard SQL; the 'TO MINUTE' should be optional.
In my comment to Ray Hidayat's answer, I pointed out that the EXISTS sub-query is necessary to ensure that the events under consideration are contiguous - there are no intervening events. Here's the same query with start and end times added to the output, and the EXISTS clause missing (and 'duration' renamed to 'lapse'):
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS lapse,
r1.time AS start, r2.time AS end
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE;
This produces the answer:
name state lapse start end
a chatting 0:04 10:06 10:10
a chatting 0:02 10:06 10:08
a Idle 0:02 10:08 10:10
b chatting 0:01 10:11 10:12
c chatting 0:13 10:02 10:15
c chatting 0:12 10:02 10:14
c chatting 0:11 10:02 10:13
c chatting 0:10 10:02 10:12
c chatting 0:09 10:02 10:11
c chatting 0:07 10:02 10:09
c chatting 0:06 10:02 10:08
c chatting 0:05 10:02 10:07
c chatting 0:04 10:02 10:06
c chatting 0:03 10:02 10:05
c chatting 0:02 10:02 10:04
c chatting 0:01 10:02 10:03
c Idle 0:12 10:03 10:15
c Idle 0:11 10:03 10:14
c Idle 0:10 10:03 10:13
c Idle 0:09 10:03 10:12
c Idle 0:08 10:03 10:11
c Idle 0:06 10:03 10:09
c Idle 0:05 10:03 10:08
c Idle 0:04 10:03 10:07
c Idle 0:03 10:03 10:06
c Idle 0:02 10:03 10:05
c Idle 0:01 10:03 10:04
c Idle 0:10 10:05 10:15
c Idle 0:09 10:05 10:14
c Idle 0:08 10:05 10:13
c Idle 0:07 10:05 10:12
c Idle 0:06 10:05 10:11
c Idle 0:04 10:05 10:09
c Idle 0:03 10:05 10:08
c Idle 0:02 10:05 10:07
c Idle 0:01 10:05 10:06
c Idle 0:08 10:07 10:15
c Idle 0:07 10:07 10:14
c Idle 0:06 10:07 10:13
c Idle 0:05 10:07 10:12
c Idle 0:04 10:07 10:11
c Idle 0:02 10:07 10:09
c Idle 0:01 10:07 10:08
c chatting 0:04 10:11 10:15
c chatting 0:03 10:11 10:14
c chatting 0:02 10:11 10:13
c chatting 0:01 10:11 10:12
c Idle 0:03 10:12 10:15
c Idle 0:02 10:12 10:14
c Idle 0:01 10:12 10:13
c chatting 0:02 10:13 10:15
c chatting 0:01 10:13 10:14
c Idle 0:01 10:14 10:15
This shows how each eligible start row for user 'c' is matched with each eligible end row, giving many spurious rows of data. The NOT EXISTS sub-query is a common theme when dealing with time-based queries. You can find information about these operations in Snodgrass's "Developing Time-Oriented Applications in SQL" (PDF available online at URL), and in Date, Darwen and Lorentzos "Temporal Data and the Relational Model".
I'm pretty sure it can be done using only SQL, it's going to take me quite a bit of time to come up with a query for you, I'll edit it in when I'm done. The basic steps I think would first be to calculate the amount of time each one takes (done by taking each entry and joining it to the next entry and subtracting to find the time difference) then a simple group by clause with a sum will easily get it into the form you've described.
Edit: Here what I came up with
SELECT l.userid, l.state, SUM(t.minutes) AS duration
FROM Log l
INNER JOIN (
SELECT l1.id, (l2.time - l1.time) AS minutes
FROM Log l1, Log l2
WHERE l2.time == (
-- find the next entry --
SELECT TOP 1 ls.time
FROM Log ls
WHERE ls.Time > l1.Time && ls.userid = l1.userid
ORDER BY ls.Time
)
) t ON l.id == t.id
GROUP BY l.userid, l.state
ORDER BY l.userid
This is semi-pseudocode, I made up all the table names and things, and you won't be able to just subtract one time from another, you'll probably be using the DATEDIFF function. Besides that though, I think that's the gist of it. I think SQL is one of the most amazing languages, you can do almost anything with little code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With