I am trying to write the following query on postgresql:
select name, author_id, count(1), (select count(1) from names as n2 where n2.id = n1.id and t2.author_id = t1.author_id ) from names as n1 group by name, author_id
This would certainly work on Microsoft SQL Server but it does not at all on postegresql. I read its documentation a bit and it seems I could rewrite it as:
select name, author_id, count(1), total from names as n1, (select count(1) as total from names as n2 where n2.id = n1.id and n2.author_id = t1.author_id ) as total group by name, author_id
But that returns the following error on postegresql: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve that?
Thanks
A subquery can be used with JOIN operation. In the example below, the subquery actually returns a temporary table which is handled by database server in memory. The temporary table from the subquery is given an alias so that we can refer to it in the outer select statement.
From clause can be used to specify a sub-query expression in SQL. The relation produced by the sub-query is then used as a new relation on which the outer query is applied. Sub queries in the from clause are supported by most of the SQL implementations.
A subquery can occur in the select list of another SELECT statement.
I'm not sure I understand your intent perfectly, but perhaps the following would be close to what you want:
select n1.name, n1.author_id, count_1, total_count from (select id, name, author_id, count(1) as count_1 from names group by id, name, author_id) n1 inner join (select id, author_id, count(1) as total_count from names group by id, author_id) n2 on (n2.id = n1.id and n2.author_id = n1.author_id)
Unfortunately this adds the requirement of grouping the first subquery by id as well as name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, as you need to have id available to join in the second subquery. Perhaps someone else will come up with a better solution.
Share and enjoy.
Complementing @Bob Jarvis and @dmikam answer, Postgres don't perform a good plan when you don't use LATERAL, below a simulation, in both cases the query data results are the same, but the cost are very different
Table structure
CREATE TABLE ITEMS ( N INTEGER NOT NULL, S TEXT NOT NULL ); INSERT INTO ITEMS SELECT (random()*1000000)::integer AS n, md5(random()::text) AS s FROM generate_series(1,1000000); CREATE INDEX N_INDEX ON ITEMS(N);
Performing JOIN
with GROUP BY
in subquery without LATERAL
EXPLAIN SELECT I.* FROM ITEMS I INNER JOIN ( SELECT COUNT(1), n FROM ITEMS GROUP BY N ) I2 ON I2.N = I.N WHERE I.N IN (243477, 997947);
The results
Merge Join (cost=0.87..637500.40 rows=23 width=37) Merge Cond: (i.n = items.n) -> Index Scan using n_index on items i (cost=0.43..101.28 rows=23 width=37) Index Cond: (n = ANY ('{243477,997947}'::integer[])) -> GroupAggregate (cost=0.43..626631.11 rows=861418 width=12) Group Key: items.n -> Index Only Scan using n_index on items (cost=0.43..593016.93 rows=10000000 width=4)
Using LATERAL
EXPLAIN SELECT I.* FROM ITEMS I INNER JOIN LATERAL ( SELECT COUNT(1), n FROM ITEMS WHERE N = I.N GROUP BY N ) I2 ON 1=1 --I2.N = I.N WHERE I.N IN (243477, 997947);
Results
Nested Loop (cost=9.49..1319.97 rows=276 width=37) -> Bitmap Heap Scan on items i (cost=9.06..100.20 rows=23 width=37) Recheck Cond: (n = ANY ('{243477,997947}'::integer[])) -> Bitmap Index Scan on n_index (cost=0.00..9.05 rows=23 width=0) Index Cond: (n = ANY ('{243477,997947}'::integer[])) -> GroupAggregate (cost=0.43..52.79 rows=12 width=12) Group Key: items.n -> Index Only Scan using n_index on items (cost=0.43..52.64 rows=12 width=4) Index Cond: (n = i.n)
My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With