Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL CTE vs Temp Table

I am running into a bit of a stumper here. My main goal is to be able to use this in C# and Entity Framework and our directive from on high is to stay away from stored procedures.

I have 2 tables: a xref and a (Celko) tree table.

/**
 ** Table [dbo].[EntityXref]
 **/
IF EXISTS(SELECT * FROM sys.tables WHERE name = N'EntityXref' AND type = N'U')
    DROP TABLE [dbo].[EntityXref]
GO
CREATE TABLE dbo.[EntityXref]
( Id                BIGINT IDENTITY(1,1) NOT NULL
, EntityId          INT NOT NULL
, EntityTypeId      INT NOT NULL
, ChildEntityId     INT NOT NULL
, ChildEntityTypeId INT NOT NULL
, CONSTRAINT [PK_EntityXref] PRIMARY KEY NONCLUSTERED ([Id] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
, CONSTRAINT [UQ_EntityXref] UNIQUE CLUSTERED (EntityId, EntityTypeId, ChildEntityId, ChildEntityTypeId)
)


/**
 ** Table [dbo].[EntityTree]
 **/
IF EXISTS(SELECT * FROM sys.tables WHERE name = N'EntityTree' AND type = N'U')
    DROP TABLE dbo.EntityTree
GO
CREATE TABLE dbo.EntityTree
( Id            BIGINT IDENTITY(1,1) NOT NULL
, SystemId      INT NOT NULL DEFAULT 1 
, EntityId      INT NOT NULL -- could be an AgencyId, UserId, ClientId, VendorId, etc
, EntityTypeId  INT NOT NULL -- Defines the entity type
, isActive      BIT NOT NULL
, lft           BIGINT NOT NULL 
, rgt           BIGINT NOT NULL 
, CONSTRAINT [PK_EntityTree] PRIMARY KEY CLUSTERED ([Id] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
, CONSTRAINT [UQ_Entity]     UNIQUE NONCLUSTERED (EntityId, EntityTypeId)
, CONSTRAINT [UQ_Left]       UNIQUE NONCLUSTERED ([lft])
, CONSTRAINT [UQ_LeftRight]  UNIQUE NONCLUSTERED ([lft], [rgt])
)
GO

The basic tree data looks like:

Customer -> Agencies -> Users -> Clients

We also have Users who manage multiple agencies, hence the xref (poor name) table. I am testing with a single user that has overview of 98% of the agencies and I need all of the clients. So, my conundrum:

NOTE:

  • EntityTypeId = 7 --> User account used for report generation
  • EntityTypeId = 4 --> Client account

This takes 4 seconds to run but cannot be expressed as a view:

DECLARE @t TABLE
( childentityid INT
, childentitytypeid INT
)

INSERT INTO @t
SELECT et.RootEntityId, et.RootEntityTypeId
  FROM dbo.EntityXref et
 WHERE et.EntityId = 17088 AND et.EntityTypeId = 7    

SELECT * 
  FROM @t a
 INNER JOIN dbo.GetMyCaseLoad b ON a.RootEntityId = b.ParentEntityId AND a.RootEntityTypeId = b.ParentEntityTypeId
GO

This takes 36-40 seconds to run (several different permutations on this join!)

WITH xrefParent (parentEntityId, parentEntityTypeId)    --, rootEntityId, rootEntityTypeId)
  AS (SELECT ChildEntityId, ChildEntityTypeId /*, EntityId, EntityTypeId */ FROM dbo.EntityXref WHERE EntityId = 17088 AND EntityTypeId = 7) 
SELECT *
  FROM GetMyCaseLoad cl
 INNER JOIN xrefParent p ON cl.ParentEntityId = p.parentEntityId AND cl.ParentEntityTypeId = p.parentEntityTypeId
-- WHERE p.rootEntityId = 17088 AND p.rootEntityTypeId = 7
GO

Any ideas on how to get the benefits of the temp table into a view for consumption by Entity Framework?


Added definitions:

CREATE VIEW GetMyCaseLoad AS
SELECT Parent.Id            [ParentRecordId]
     , Parent.EntityId      [ParentEntityId]
     , Parent.EntityTypeId  [ParentEntityTypeId]
     , Child.SystemId       [ChildSystemId]
     , Child.Id             [ChildRecordId]
     , Child.EntityId       [ChildEntityId]
     , Child.EntityTypeId   [ChildEntityTypeId]
     , Child.isActive       [ChildIsActive]
     , Child.lft            [ChildLeft]
     , Child.rgt            [ChildRight]
  FROM dbo.EntityTree Parent
     , dbo.EntityTree Child
 WHERE Child.lft > Parent.lft
   AND Child.rgt < Parent.rgt
   AND Child.EntityTypeId = 4
GO


CREATE VIEW GetMyFullCaseLoad AS
SELECT x.Id             [XrefRecordId]
     , x.EntityId       [XrefParentEntityId]
     , x.EntityTypeId   [XrefParentEntityTypeId]
     , c.ParentRecordId
     , c.ParentEntityId
     , c.ParentEntityTypeId
     , c.ChildRecordId
     , c.ChildEntityId
     , c.ChildEntityTypeId
     , c.ChildIsActive
     , c.ChildLeft
     , c.ChildRight
     , x.CanRead
     , x.CanWrite 
  FROM EntityXref x
 INNER JOIN dbo.GetMyCaseLoad c ON x.ChildEntityId = c.ParentEntityId AND x.ChildEntityTypeId = c.ParentEntityTypeId
GO

The 2nd view is what we are trying to speed up.

SIDE NOTE: The current system takes about 2-3 minutes to bring back records. The 2nd view or CTE does it in 40 seconds based on a new data structure (adjacency tree vs set tree). With the temp table 4 seconds.

like image 898
Keith Barrows Avatar asked Jan 26 '16 18:01

Keith Barrows


People also ask

Is CTE better than temp table?

Looking at SQL Profiler results from these queries (each were run 10 times and averages are below) we can see that the CTE just slightly outperforms both the temporary table and table variable queries when it comes to overall duration.

When to use temp table and CTE is SQL?

As far as when to use each, they have very different use cases. If you will have a very large result set, or need to refer to it more than once, put it in a #temp table. If it needs to be recursive, is disposable, or is just to simplify something logically, a CTE is preferred.

What is difference between temp table and table variable and CTE?

Scope wise the local temp table is available only in the current session. The global temp tables are available for all the sessions or the SQL Server connections. The scope of the table variable is just within the batch or a view or a stored procedure. The scope of the CTE is limited to the statement which follows it.

Is CTE faster than subquery?

The performance of CTEs and subqueries should, in theory, be the same since both provide the same information to the query optimizer. One difference is that a CTE used more than once could be easily identified and calculated once. The results could then be stored and read multiple times.


1 Answers

The problem with CTE's is that they are not materialized, they do not have dedicated statistics (they rely on statistics of the underlying objects), they don't have indexes (although in some cases they can use indexes on referenced tables).

The upsides of temporary tables is that they are inherently materialized (in tempdb), they can have indexes (if you define them) and most definitely have dedicated statistics.

In a lot of cases that means that using temporary tables instead of CTEs can produce better execution plans. Using a CTE will almost never speed up things, while a temporary table in a lot of cases will.

I will defer to a higher authority than me and leave you with his quote:

A CTE should never be used for performance. You will almost never speed things up by using a CTE because it's just a disposable view. You can do some neat things with them but speeding up a query isn't really one of them.

This quote is from the accepted answer to the question "What's the difference between a CTE and a Temp Table?"

PS: I see that you are using a TABLE variable in your first query. This is not the same as a temporary table. A temporary table will almost always beat a TABLE variable in terms of performance. For a good essay on the difference between TABLE variables and temporary tables, read this accepted answer on the question "What's the difference between a temp table and table variable in SQL Server?".

like image 157
TT. Avatar answered Oct 19 '22 23:10

TT.