Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow query on "UNION ALL" view

I have a DB view which basically consists of two SELECT queries with UNION ALL, like this:

CREATE VIEW v AS
SELECT time, etc. FROM t1 // #1...
UNION ALL
SELECT time, etc. FROM t2 // #2...

The problem is that selects of the form

SELECT ... FROM v WHERE time >= ... AND time < ...

perform really really slow on it.

Both SELECT #1 and #2 are decently fast, properly indexed and so on: when I create views v1 and v2 like:

CREATE VIEW v1 AS
SELECT time, etc. FROM t1 // #1...

CREATE VIEW v2 AS
SELECT time, etc. FROM t2 // #2...

And the same SELECT, with same WHERE condition as the above works OK on them individually.

Any ideas about where might be the problem and how to solve it?

(Just to mention, it's one of the recent Postgres versions.)

Edit: Adding anonymized query plans (thaks to @filiprem for the link to an awesome tool):

v1:

Aggregate  (cost=9825.510..9825.520 rows=1 width=53) (actual time=59.995..59.995 rows=1 loops=1)
  ->  Index Scan using delta on echo alpha  (cost=0.000..9815.880 rows=3850 width=53) (actual time=0.039..53.418 rows=33122 loops=1)
          Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey))
          Filter: ((NOT victor) AND ((bravo_sierra five NULL) OR ((bravo_sierra)::golf <> 'india'::golf)))

v2:

Aggregate  (cost=15.470..15.480 rows=1 width=33) (actual time=0.231..0.231 rows=1 loops=1)
  ->  Index Scan using yankee on six charlie  (cost=0.000..15.220 rows=99 width=33) (actual time=0.035..0.186 rows=140 loops=1)
          Index Cond: (("juliet" >= 'seven'::uniform bravo oscar whiskey) AND ("juliet" <= 'november'::uniform bravo oscar whiskey))
          Filter: (NOT victor)

v:

Aggregate  (cost=47181.850..47181.860 rows=1 width=0) (actual time=37317.291..37317.291 rows=1 loops=1)
  ->  Append  (cost=42.170..47132.480 rows=3949 width=97) (actual time=1.277..37304.453 rows=33262 loops=1)
        ->  Nested Loop Left Join  (cost=42.170..47052.250 rows=3850 width=99) (actual time=1.275..37288.465 rows=33122 loops=1)
              ->  Hash Left Join  (cost=42.170..9910.990 rows=3850 width=115) (actual time=1.123..117.797 rows=33122 loops=1)
                      Hash Cond: ((alpha_seven.two)::golf = (quebec_three.two)::golf)
                    ->  Index Scan using delta on echo alpha_seven  (cost=0.000..9815.880 rows=3850 width=132) (actual time=0.038..77.866 rows=33122 loops=1)
                            Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey_two) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey_two))
                            Filter: ((NOT victor) AND ((bravo_sierra five NULL) OR ((bravo_sierra)::golf <> 'india'::golf)))
                    ->  Hash  (cost=30.410..30.410 rows=941 width=49) (actual time=1.068..1.068 rows=941 loops=1)
                            Buckets: 1024  Batches: 1  Memory Usage: 75kB
                          ->  Seq Scan on alpha_india quebec_three  (cost=0.000..30.410 rows=941 width=49) (actual time=0.010..0.486 rows=941 loops=1)
              ->  Index Scan using mike on hotel quebec_sierra  (cost=0.000..9.630 rows=1 width=24) (actual time=1.112..1.119 rows=1 loops=33122)
                      Index Cond: ((alpha_seven.zulu)::golf = (quebec_sierra.zulu)::golf)
        ->  Subquery Scan on "*SELECT* 2"  (cost=34.080..41.730 rows=99 width=38) (actual time=1.081..1.951 rows=140 loops=1)
              ->  Merge Right Join  (cost=34.080..40.740 rows=99 width=38) (actual time=1.080..1.872 rows=140 loops=1)
                      Merge Cond: ((quebec_three.two)::golf = (charlie.two)::golf)
                    ->  Index Scan using whiskey_golf on alpha_india quebec_three  (cost=0.000..174.220 rows=941 width=49) (actual time=0.017..0.122 rows=105 loops=1)
                    ->  Sort  (cost=18.500..18.750 rows=99 width=55) (actual time=0.915..0.952 rows=140 loops=1)
                            Sort Key: charlie.two
                            Sort Method:  quicksort  Memory: 44kB
                          ->  Index Scan using yankee on six charlie  (cost=0.000..15.220 rows=99 width=55) (actual time=0.022..0.175 rows=140 loops=1)
                                  Index Cond: (("juliet" >= 'seven'::uniform bravo_victor oscar whiskey_two) AND ("juliet" <= 'november'::uniform bravo_victor oscar whiskey_two))
                                  Filter: (NOT victor)

juliet is time.

like image 749
Mladen Jablanović Avatar asked Jan 27 '12 09:01

Mladen Jablanović


People also ask

Does UNION all take a long time?

Query alone takes about 3 seconds to execute (UNION took 4.5-5.5 seconds) Each part in seperate runs in seconds. The application does sorting and select on this view, which makes it processing time even larger - about 6 seconds when query is cached, about 12 seconds or more if data has changed.

Is UNION query slow?

The sql statement is a simple union all between two queries. Each one on its own is instantaneous. Union all them however and it becomes 20x slower.

How do you optimize a UNION all query?

All partitions in a UNION ALL view must have a complete set of indexes defined for optimization to work. Queries with DISTINCT will tend to run more slowly using a UNION ALL view than a base table.

Which is faster UNION or UNION all?

Both UNION and UNION ALL operators combine rows from result sets into a single result set. The UNION operator removes eliminate duplicate rows, whereas the UNION ALL operator does not. Because the UNION ALL operator does not remove duplicate rows, it runs faster than the UNION operator.


1 Answers

This seems to be a case of a pilot error. The "v" query plan selects from at least 5 different tables.

Now, Are You sure You are connected to the right database? Maybe there are some funky search_path settings? Maybe t1 and t2 are actually views (possibly in a different schema)? Maybe You are somehow selecting from the wrong view?

Edited after clarification:

You are using a quite new feature called "join removal" : http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.0#Join_Removal

http://rhaas.blogspot.com/2010/06/why-join-removal-is-cool.html

It appears that the feature does not kick in when union all is involved. You probably have to rewrite the view using only the required two tables.

another edit: You appear to be using an aggregate (like "select count(*) from v" vs. "select * from v"), which could get vastly different plans in face of join removal. I guess we won't get very far without You posting the actual queries, view and table definitions and plans used...

like image 132
maniek Avatar answered Oct 16 '22 21:10

maniek