
Joining against views in SQL Server with strange query optimizer behavior

I have a complex view that I use to pull a list of primary keys that indicate rows in a table that have been modified between two temporal points.

This view has to query 13 related tables and look at a changelog table to determine whether an entity is "dirty" or not.

Even with all of this going on, doing a simple query:

select * from vwDirtyEntities;

Takes only 2 seconds.

However, if I change it to

select
    e.Name
from 
    Entities e 
         inner join vwDirtyEntities de
             on e.Entity_ID = de.Entity_ID

This takes 1.5 minutes.

However, if I do this:

declare @dirtyEntities table
(
    Entity_id uniqueidentifier
)

insert into @dirtyEntities 
   select * from vwDirtyEntities;


select
   e.Name
from 
    Entities e 
        inner join @dirtyEntities de
           on e.Entity_ID = de.Entity_ID

I get the same results in only 2 seconds.

This leads me to believe that SQL Server is evaluating the view once per row when it is joined to Entities, instead of building a single query plan that combines the inner join above with the joins inside the view.

Note that I want to join against the full result set from this view, as it filters out only the keys I want internally.

I know I could make it into a materialized (indexed) view, but this would involve schema binding the view and its dependencies, and I don't like the overhead that maintaining the index would cause (this view is only queried for exports, while there are far more writes to the underlying tables).

So, aside from using a table variable to cache the view results, is there any way to tell SQL Server to cache the view while evaluating the join? I tried changing the join order (Select from the view and join against Entities), however that did not make any difference.

The view itself is also very efficient, and there is no room to optimize there.

FlySwat asked Sep 29 '10 03:09



1 Answer

There is nothing magical about a view. It's a macro that expands. When you JOIN to it, the optimiser expands the view into the main query and optimises the whole thing as one statement.

I'll address other points in your post:

  • you have ruled out an indexed view. A view can only be a discrete entity when it is indexed

  • SQL Server will never do a RBAR (row by agonizing row) query on its own. Only developers can write loops.

  • there is no concept of caching: every query uses the latest data unless you use temp tables

  • you insist on using the view, which you've decided is very efficient, but you have no idea how views are treated by the optimizer, and it has 13 tables

  • SQL is declarative: join order usually does not matter

  • Many serious DB developers don't use views because of limitations like this: they are not reusable because they are macros
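
As a rough sketch of the temp table route mentioned in the list above: materialise the view's keys once, then join against the stored copy. This assumes the view returns a single Entity_ID column, as the table variable in the question suggests. The names #DirtyEntities and IX_DirtyEntities are made up for illustration. Unlike a table variable, a temp table gets statistics, which may help the optimizer if the key list is large.

```sql
-- Hypothetical temp table holding the keys returned by the view.
CREATE TABLE #DirtyEntities
(
    Entity_ID uniqueidentifier NOT NULL
);

-- Evaluate the view exactly once and store its result.
INSERT INTO #DirtyEntities (Entity_ID)
SELECT Entity_ID
FROM vwDirtyEntities;

-- Optional: a non-unique clustered index on the join column.
CREATE CLUSTERED INDEX IX_DirtyEntities ON #DirtyEntities (Entity_ID);

-- Join against the stored keys instead of the expanded view.
SELECT e.Name
FROM Entities e
    INNER JOIN #DirtyEntities de
        ON e.Entity_ID = de.Entity_ID;

DROP TABLE #DirtyEntities;
```

The result is a snapshot: changes made to the underlying tables after the INSERT are not reflected in the join.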

Edit: another possibility is predicate pushing on SQL Server 2005. That is, SQL Server cannot push the JOIN condition "deeper" into the view.
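
One way to check what the optimizer is doing with the expanded view is to compare the slow query with and without a join hint while collecting I/O and timing statistics. This is only a diagnostic sketch, not a recommended fix: OPTION (HASH JOIN) is a query-level hint, so it applies to every join in the statement, including the joins that come from the expanded view, and it may make things worse or fail to produce a plan.

```sql
-- Report logical reads and elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The slow query from the question, forced to use hash joins throughout.
-- Compare its reads and timings with the unhinted version.
SELECT e.Name
FROM Entities e
    INNER JOIN vwDirtyEntities de
        ON e.Entity_ID = de.Entity_ID
OPTION (HASH JOIN);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the hinted version runs in roughly the 2 seconds seen with the table variable, that would suggest the original plan picked a poor join strategy for the expanded view. Comparing the two execution plans should show where they differ.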

gbn answered Oct 21 '22 13:10
