Postgres, table1 left join table2 with only 1 row per ID in table1

Question

Ok, so the title is a bit convoluted. This is basically a greatest-n-per-group type problem, but I can't for the life of me figure it out.

I have a table, user_stats:

      Column      |  Type   |                        Modifiers
------------------+---------+---------------------------------------------------------
 id               | bigint  | not null default nextval('user_stats_id_seq'::regclass)
 user_id          | bigint  | not null
 datestamp        | integer | not null
 post_count       | integer | 
 friends_count    | integer | 
 favourites_count | integer |  
Indexes:
    "user_stats_pk" PRIMARY KEY, btree (id)
    "user_stats_datestamp_index" btree (datestamp)
    "user_stats_user_id_index" btree (user_id)
Foreign-key constraints:
    "user_user_stats_fk" FOREIGN KEY (user_id) REFERENCES user_info(id)

I want to get the stats for each user_id at its latest datestamp. This is a biggish table, somewhere in the neighborhood of 41 million rows, so I've created a temp table of user_id and latest date using:

CREATE TEMP TABLE id_max_date AS
    (SELECT user_id, MAX(datestamp) AS date FROM user_stats GROUP BY user_id);

The problem is that datestamp isn't unique, since there can be more than one stat update in a day (it should have been a real timestamp, but the guy who designed this was kind of an idiot and there's too much data to go back and change it at the moment). So some IDs have multiple rows when I do the JOIN:

SELECT user_stats.user_id, user_stats.datestamp, user_stats.post_count,
       user_stats.friends_count, user_stats.favourites_count
  FROM id_max_date JOIN user_stats
    ON id_max_date.user_id = user_stats.user_id
   AND id_max_date.date = user_stats.datestamp;

If I was doing this as subselects I guess I could LIMIT 1, but I've always heard those are horribly inefficient. Thoughts?

rfusca · Accepted Answer

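DISTINCT ON is your friend. It keeps only the first row of each user_id group in ORDER BY order, and Postgres requires the ORDER BY to start with the DISTINCT ON expression:

select distinct on (user_id) * from user_stats order by user_id, datestamp desc;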
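
Since datestamp is not unique per user, the DISTINCT ON query still returns one row per user_id, but which of the tied rows comes back is not defined. A minimal sketch of one way to pin that down, assuming a larger id means a later insert (the question does not confirm this). The index is hypothetical, its name is made up for illustration, and whether the planner actually uses it would need checking with EXPLAIN on the real data:

```sql
-- Assumption: id comes from a sequence, so a larger id means a later insert
-- and can break ties between rows sharing the same user_id and datestamp.
SELECT DISTINCT ON (user_id)
       user_id, datestamp, post_count, friends_count, favourites_count
  FROM user_stats
 ORDER BY user_id, datestamp DESC, id DESC;

-- Hypothetical index matching the sort order above (name is illustrative).
CREATE INDEX user_stats_user_id_datestamp_id_index
    ON user_stats (user_id, datestamp DESC, id DESC);
```

With this form the id_max_date temp table and the JOIN are not needed, because the latest row per user is picked in a single pass over user_stats.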

Tags:

sql

postgresql

greatest-n-per-group

Peck
