Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will Postgres push down a WHERE clause into a VIEW with a Window Function (Aggregate)?

The docs for Pg's Window function say:

The rows considered by a window function are those of the "virtual table" produced by the query's FROM clause as filtered by its WHERE, GROUP BY, and HAVING clauses if any. For example, a row removed because it does not meet the WHERE condition is not seen by any window function. A query can contain multiple window functions that slice up the data in different ways by means of different OVER clauses, but they all act on the same collection of rows defined by this virtual table.

However, I'm not seeing this. It seems to me like the Select Filter is very near to the left margin and the top (last thing done).

=# EXPLAIN SELECT * FROM chrome_nvd.view_options where fkey_style = 303451;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Subquery Scan view_options  (cost=2098450.26..2142926.28 rows=14825 width=180)
   Filter: (view_options.fkey_style = 303451)
   ->  Sort  (cost=2098450.26..2105862.93 rows=2965068 width=189)
         Sort Key: o.sequence
         ->  WindowAgg  (cost=1446776.02..1506077.38 rows=2965068 width=189)
               ->  Sort  (cost=1446776.02..1454188.69 rows=2965068 width=189)
                     Sort Key: h.name, k.name
                     ->  WindowAgg  (cost=802514.45..854403.14 rows=2965068 width=189)
                           ->  Sort  (cost=802514.45..809927.12 rows=2965068 width=189)
                                 Sort Key: h.name
                                 ->  Hash Join  (cost=18.52..210141.57 rows=2965068 width=189)
                                       Hash Cond: (o.fkey_opt_header = h.id)
                                       ->  Hash Join  (cost=3.72..169357.09 rows=2965068 width=166)
                                             Hash Cond: (o.fkey_opt_kind = k.id)
                                             ->  Seq Scan on options o  (cost=0.00..128583.68 rows=2965068 width=156)
                                             ->  Hash  (cost=2.21..2.21 rows=121 width=18)
                                                   ->  Seq Scan on opt_kind k  (cost=0.00..2.21 rows=121 width=18)
                                       ->  Hash  (cost=8.80..8.80 rows=480 width=31)
                                             ->  Seq Scan on opt_header h  (cost=0.00..8.80 rows=480 width=31)
(19 rows)

These two WindowAgg's essentially change the plan to something that seems to never finish from the much faster

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan view_options  (cost=329.47..330.42 rows=76 width=164) (actual time=20.263..20.403 rows=42 loops=1)
   ->  Sort  (cost=329.47..329.66 rows=76 width=189) (actual time=20.258..20.300 rows=42 loops=1)
         Sort Key: o.sequence
         Sort Method:  quicksort  Memory: 35kB
         ->  Hash Join  (cost=18.52..327.10 rows=76 width=189) (actual time=19.427..19.961 rows=42 loops=1)
               Hash Cond: (o.fkey_opt_header = h.id)
               ->  Hash Join  (cost=3.72..311.25 rows=76 width=166) (actual time=17.679..18.085 rows=42 loops=1)
                     Hash Cond: (o.fkey_opt_kind = k.id)
                     ->  Index Scan using options_pkey on options o  (cost=0.00..306.48 rows=76 width=156) (actual time=17.152..17.410 rows=42 loops=1)
                           Index Cond: (fkey_style = 303451)
                     ->  Hash  (cost=2.21..2.21 rows=121 width=18) (actual time=0.432..0.432 rows=121 loops=1)
                           ->  Seq Scan on opt_kind k  (cost=0.00..2.21 rows=121 width=18) (actual time=0.042..0.196 rows=121 loops=1)
               ->  Hash  (cost=8.80..8.80 rows=480 width=31) (actual time=1.687..1.687 rows=480 loops=1)
                     ->  Seq Scan on opt_header h  (cost=0.00..8.80 rows=480 width=31) (actual time=0.030..0.748 rows=480 loops=1)
 Total runtime: 20.893 ms
(15 rows)

What is going on, and how do I fix it? I'm using Postgresql 8.4.8. Here is what the actual view is doing:

 SELECT o.fkey_style, h.name AS header, k.name AS kind
   , o.code, o.name AS option_name, o.description
     , count(*) OVER (PARTITION BY h.name) AS header_count
     , count(*) OVER (PARTITION BY h.name, k.name) AS header_kind_count
   FROM chrome_nvd.options o
   JOIN chrome_nvd.opt_header h ON h.id = o.fkey_opt_header
   JOIN chrome_nvd.opt_kind k ON k.id = o.fkey_opt_kind
  ORDER BY o.sequence;
like image 279
NO WAR WITH RUSSIA Avatar asked Sep 23 '11 19:09

NO WAR WITH RUSSIA


1 Answers

No, PostgreSQL will only push down a WHERE clause on a VIEW that does not have an Aggregate. (Window functions are consider Aggregates).

< x> I think that's just an implementation limitation

< EvanCarroll> x: I wonder what would have to be done to push the WHERE clause down in this case.

< EvanCarroll> the planner would have to know that the WindowAgg doesn't itself add selectivity and therefore it is safe to push the WHERE down?

< x> EvanCarroll; a lot of very complicated work with the planner, I'd presume

And,

< a> EvanCarroll: nope. a filter condition on a view applies to the output of the view and only gets pushed down if the view does not involve aggregates

like image 57
NO WAR WITH RUSSIA Avatar answered Oct 25 '22 05:10

NO WAR WITH RUSSIA