
Can one efficiently LEFT OUTER JOIN a subset of the left table's rows in Postgres?

Let's say I have the following tables:

table_1                  table_2
id_a    name             id_a    id_b
1       c                1       1
2       a                1       2
3       b                2       1
                         2       2
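
To make the example reproducible, the sample data could be set up roughly as follows. The column types and the primary key are assumptions; the question only shows the values.

```sql
-- Sample data from the tables above.
-- Column types and the primary key are assumed, not given in the question.
CREATE TABLE table_1 (id_a integer PRIMARY KEY, name text);
CREATE TABLE table_2 (id_a integer, id_b integer);

INSERT INTO table_1 (id_a, name) VALUES (1, 'c'), (2, 'a'), (3, 'b');
INSERT INTO table_2 (id_a, id_b) VALUES (1, 1), (1, 2), (2, 1), (2, 2);
```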

Now consider the following LEFT OUTER JOIN:

SELECT *
FROM table_1
LEFT OUTER JOIN table_2 USING (id_a)

id_a    name  id_b  
1       c     1
1       c     2
2       a     1
2       a     2
3       b

Now imagine that 'FROM table_1' is actually a complex sub-query, like:

SELECT * FROM huge_table WHERE expensive_conditions_producing_three_rows

Is it possible to write a query that only joins against the left row with the minimum name, without re-running the sub-query entirely? You can assume that you have some control over the sub-query, i.e. you could add an ORDER BY if necessary.

In other words, the end result should look like this:

id_a    name  id_b
1       c
2       a     1
2       a     2
3       b

I considered using SELECT INTO to place the sub-query results in a temporary table. Then it wouldn't be a problem to compute the minimum for use in a JOIN ON condition. But I'd prefer to avoid this unless it's the only solution.
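
For reference, a minimal sketch of that temporary-table fallback, which should also work on 8.3 and earlier. The name tmp_left is hypothetical, and I am assuming huge_table exposes id_a and name; expensive_conditions_producing_three_rows is the same placeholder as above.

```sql
-- Run the expensive sub-query exactly once and keep its (few) rows.
-- tmp_left is a hypothetical name chosen for this sketch.
CREATE TEMP TABLE tmp_left AS
SELECT id_a, name
FROM   huge_table
WHERE  expensive_conditions_producing_three_rows;

-- Every row of tmp_left is returned, but only the row(s) holding
-- the minimum name can match rows in table_2.
SELECT l.id_a, l.name, t2.id_b
FROM   tmp_left l
LEFT   JOIN table_2 t2
       ON  t2.id_a = l.id_a
       AND l.name = (SELECT min(name) FROM tmp_left);

DROP TABLE tmp_left;
```

The min(name) lookup only scans the small temporary table, so the expensive conditions are not evaluated a second time.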

Edit: I'll wait a couple of days and then accept the best solution, regardless of PG version. One that works in PG 8.3 and earlier would be greatly appreciated, though.

DNS asked Jan 18 '23 14:01


2 Answers

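Using window functions (available from PostgreSQL 8.4):

SELECT *
FROM
      ( SELECT *
             , ROW_NUMBER() OVER (ORDER BY name) AS RowNum   -- 1 = row with the minimum name
        FROM table_1
      ) AS a
  LEFT JOIN
      table_2 AS b
    ON
       b.id_a = a.id_a    -- the join condition from the question
    AND
       a.RowNum = 1       -- only the first row is allowed to match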
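
One caveat: if several rows tie for the minimum, ROW_NUMBER() numbers only one of them as 1, and which one is arbitrary. A variant using RANK() would let every tied row join, which is closer to comparing against min(name). The sketch below assumes the question's table_1 and table_2, orders by name, and lists columns explicitly so the helper column (rnk, a name picked for this example) stays out of the result.

```sql
SELECT a.id_a, a.name, b.id_b
FROM
      ( SELECT id_a, name
             , RANK() OVER (ORDER BY name) AS rnk   -- all rows tied for the minimum get rank 1
        FROM table_1
      ) AS a
  LEFT JOIN
      table_2 AS b
    ON  b.id_a = a.id_a
    AND a.rnk = 1            -- rows with rnk > 1 are kept, but with id_b NULL
ORDER BY a.id_a, b.id_b;
```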
ypercubeᵀᴹ answered Jan 30 '23 07:01


Use a CTE (common table expression) for that (available in PostgreSQL 8.4 or later):

WITH cte AS (
    SELECT id_a, name
    FROM   table_1
    WHERE  expensive_conditions_producing_three_rows
    )
SELECT c.id_a, c.name, t2.id_b
FROM   cte c
LEFT   JOIN table_2 t2 ON t2.id_a = c.id_a
                      AND c.name = (SELECT min(name) FROM cte)  -- only the row with the minimum name can match
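
On PostgreSQL 12 and later the planner may inline a CTE into the outer query when it is referenced only once. This query references cte twice, so it should be materialized by default, but the intent can be stated explicitly with AS MATERIALIZED. A sketch, assuming the sub-query reads from the question's huge_table and that it exposes id_a and name:

```sql
-- Requires PostgreSQL 12+; MATERIALIZED forces the CTE to be computed once.
WITH cte AS MATERIALIZED (
    SELECT id_a, name
    FROM   huge_table
    WHERE  expensive_conditions_producing_three_rows  -- placeholder from the question
    )
SELECT c.id_a, c.name, t2.id_b
FROM   cte c
LEFT   JOIN table_2 t2 ON t2.id_a = c.id_a
                      AND c.name = (SELECT min(name) FROM cte);
```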
Erwin Brandstetter answered Jan 30 '23 06:01