In SQL how to select previous rows based on the current row values?

I have a simple SQL table that looks like this:

CREATE TABLE msg (
    from_person character varying(10),
    from_location character varying(10),
    to_person character varying(10),
    to_location character varying(10),
    msglength integer,
    ts timestamp without time zone
);

[Image: sample data]

I want to find out, for each row in the table, whether a different 'from_person' and 'from_location' has interacted with the 'to_person' of the current row in the last 3 minutes.

For example, in the table above, for row #4, other than mary from Mumbai (current row), nancy from NYC and bob from Barcelona have also sent a message to charlie in the last 3 minutes, so the count is 2.

Similarly, for row #2, other than bob from Barcelona (current row), only nancy from NYC has sent a message to charlie in ca (current row), so the count is 1.

Example desired output:

0
1
0
2

I tried using a window function, but it seems that in the frame clause I can specify a number of rows before and after the current row, not a time interval.

user375868 asked Mar 14 '23 09:03


2 Answers

Every table in Postgres should have a primary key. Ideally, yours would have one that also defines the expected order of the rows.

Example data:

create table msg (
    id int primary key,
    from_person text,
    to_person text,
    ts timestamp without time zone
);

insert into msg values
(1, 'nancy',   'charlie', '2016-02-01 01:00:00'),
(2, 'bob',     'charlie', '2016-02-01 01:00:00'),
(3, 'charlie', 'nancy',   '2016-02-01 01:00:01'),
(4, 'mary',    'charlie', '2016-02-01 01:02:00');

The query:

select m1.id, count(m2.id)
from msg m1
left join msg m2
    on m2.id < m1.id                          -- only rows that come earlier
    and m2.to_person = m1.to_person           -- same recipient
    and m2.ts >= m1.ts - interval '3 minutes' -- within the last 3 minutes
group by m1.id
order by m1.id;

 id | count 
----+-------
  1 |     0
  2 |     1
  3 |     0
  4 |     2
(4 rows)

If the table has no primary key, you can use the function row_number() instead, for example:

with msg_with_rn as (
    -- rn stands in for the missing primary key; ties on ts are broken by from_person
    select *, row_number() over (order by ts, from_person desc) as rn
    from msg
)
select m1.rn, count(m2.rn)
from msg_with_rn m1
left join msg_with_rn m2
    on m2.rn < m1.rn
    and m2.to_person = m1.to_person
    and m2.ts >= m1.ts - interval '3 minutes'
group by m1.rn
order by m1.rn;

Note that I have used row_number() over (order by ts, from_person desc) to get the sequence of rows as you have presented in the question. Of course, you should decide yourself how to resolve ambiguities arising from the same values of the column ts (as in the first two rows).
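
The queries above use a trimmed-down table and count every earlier message to the same recipient, including repeat messages from the current sender. A minimal sketch against the six-column table from the question might add the "different sender" condition as follows. It assumes PostgreSQL, reuses the same row_number() tie-break, and treats a sender as different when either from_person or from_location differs; the alias other_senders is made up for illustration.

```sql
-- Sketch only: assumes the six-column msg table from the question.
with msg_with_rn as (
    -- rn is a helper ordering; the tie-break on from_person is an assumption
    select *, row_number() over (order by ts, from_person desc) as rn
    from msg
)
select m1.rn, m1.from_person, m1.to_person, m1.ts,
       count(m2.rn) as other_senders
from msg_with_rn m1
left join msg_with_rn m2
    on m2.rn < m1.rn                            -- earlier rows only
    and m2.to_person = m1.to_person             -- same recipient
    and m2.ts >= m1.ts - interval '3 minutes'   -- within the last 3 minutes
    -- sender differs in person or location (null-safe row comparison)
    and (m2.from_person, m2.from_location)
        is distinct from (m1.from_person, m1.from_location)
group by m1.rn, m1.from_person, m1.to_person, m1.ts
order by m1.rn;
```

This counts matching messages rather than distinct senders, so two earlier messages from the same other sender would count twice.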

klin answered Mar 16 '23 15:03


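This should more or less do it. Depending on your requirements, you may need to modify the middle two conditions in the where clause: as written, they only count messages where both the sender and the location differ from the current row. Note also that abs() matches messages up to 3 minutes before or after the current row, not only earlier ones.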

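select *,
   (select count(*) from msg m2
    where m2.to_person = m1.to_person
        and m2.from_person != m1.from_person      -- sender differs
        and m2.from_location != m1.from_location  -- and location differs
        -- within 3 minutes of the current row, before or after
        and abs(extract(epoch from (m1.ts - m2.ts))) <= 3*60) as other_senders
from msg m1;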
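
On the frame-clause limitation raised in the question: PostgreSQL 11 and later accept an interval offset in a RANGE frame, so a time-based window is possible. The sketch below is only an approximation of the requested count. It assumes PostgreSQL 11+, and the alias recent_msgs is made up for illustration.

```sql
-- Sketch only (PostgreSQL 11+): a time-based window frame.
-- Counts messages to the same recipient from the preceding 3 minutes,
-- minus the current row itself. It does NOT exclude earlier messages
-- from the same sender, and rows sharing a timestamp count each other.
select *,
       count(*) over (
           partition by to_person
           order by ts
           range between interval '3 minutes' preceding and current row
       ) - 1 as recent_msgs
from msg;
```

Because rows with equal ts are peers in a RANGE frame, the two simultaneous messages to charlie in the first answer's sample data would each count the other, so this does not reproduce the desired 0, 1, 0, 2 exactly. The join-based queries remain the safer choice when ties or the sender filter matter.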

Tim Jasko answered Mar 16 '23 14:03