Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to do a Postgresql subquery in select clause with join in from clause like SQL Server?

I am trying to write the following query on postgresql:

select name, author_id, count(1),      (select count(1)     from names as n2     where n2.id = n1.id         and t2.author_id = t1.author_id     )                from names as n1 group by name, author_id 

This would certainly work on Microsoft SQL Server but it does not at all on postegresql. I read its documentation a bit and it seems I could rewrite it as:

select name, author_id, count(1), total                      from names as n1, (select count(1) as total     from names as n2     where n2.id = n1.id         and n2.author_id = t1.author_id     ) as total group by name, author_id 

But that returns the following error on postegresql: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve that?

Thanks

like image 890
Ricardo Avatar asked Jun 09 '10 10:06

Ricardo


People also ask

Can you use a subquery with a JOIN?

A subquery can be used with JOIN operation. In the example below, the subquery actually returns a temporary table which is handled by database server in memory. The temporary table from the subquery is given an alias so that we can refer to it in the outer select statement.

Can subquery be used in FROM clause in SQL?

From clause can be used to specify a sub-query expression in SQL. The relation produced by the sub-query is then used as a new relation on which the outer query is applied. Sub queries in the from clause are supported by most of the SQL implementations.

Can subquery appear in select clause?

A subquery can occur in the select list of another SELECT statement.


2 Answers

I'm not sure I understand your intent perfectly, but perhaps the following would be close to what you want:

select n1.name, n1.author_id, count_1, total_count   from (select id, name, author_id, count(1) as count_1           from names           group by id, name, author_id) n1 inner join (select id, author_id, count(1) as total_count               from names               group by id, author_id) n2   on (n2.id = n1.id and n2.author_id = n1.author_id) 

Unfortunately this adds the requirement of grouping the first subquery by id as well as name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, as you need to have id available to join in the second subquery. Perhaps someone else will come up with a better solution.

Share and enjoy.

like image 108

Complementing @Bob Jarvis and @dmikam answer, Postgres don't perform a good plan when you don't use LATERAL, below a simulation, in both cases the query data results are the same, but the cost are very different

Table structure

CREATE TABLE ITEMS (     N INTEGER NOT NULL,     S TEXT NOT NULL );  INSERT INTO ITEMS   SELECT     (random()*1000000)::integer AS n,     md5(random()::text) AS s   FROM     generate_series(1,1000000);  CREATE INDEX N_INDEX ON ITEMS(N); 

Performing JOIN with GROUP BY in subquery without LATERAL

EXPLAIN  SELECT      I.* FROM ITEMS I INNER JOIN (     SELECT          COUNT(1), n     FROM ITEMS     GROUP BY N ) I2 ON I2.N = I.N WHERE I.N IN (243477, 997947); 

The results

Merge Join  (cost=0.87..637500.40 rows=23 width=37)   Merge Cond: (i.n = items.n)   ->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)         Index Cond: (n = ANY ('{243477,997947}'::integer[]))   ->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)         Group Key: items.n         ->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4) 

Using LATERAL

EXPLAIN  SELECT      I.* FROM ITEMS I INNER JOIN LATERAL (     SELECT          COUNT(1), n     FROM ITEMS     WHERE N = I.N     GROUP BY N ) I2 ON 1=1 --I2.N = I.N WHERE I.N IN (243477, 997947); 

Results

Nested Loop  (cost=9.49..1319.97 rows=276 width=37)   ->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)         Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))         ->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)               Index Cond: (n = ANY ('{243477,997947}'::integer[]))   ->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)         Group Key: items.n         ->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)               Index Cond: (n = i.n) 

My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)

like image 27
deFreitas Avatar answered Oct 21 '22 21:10

deFreitas