Table vs Temp Table Performance

2 Answers

In your situation we use a permanent table called a staging table. This is a common method for large imports. In fact, we generally use two staging tables, one holding the raw data and one holding the cleaned-up data, which makes researching issues with the feed easier (they are almost always the result of new and varied ways our clients find to send us junk data, but we have to be able to prove that). You also avoid problems such as having to grow tempdb, or making other users who need tempdb wait while it grows for you.
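A minimal sketch of the two-table layout, to show why it helps with research. All table and column names here are hypothetical, and the raw table keeps everything as text on the assumption that the feed cannot be trusted to match the target types.

```sql
-- Hypothetical schema; names and types are illustrative only.
-- Raw staging: text columns, so nothing the client sends is rejected on load.
CREATE TABLE dbo.ImportRaw (
    RawId      INT IDENTITY(1,1) NOT NULL,
    CustomerId VARCHAR(50) NULL,
    OrderDate  VARCHAR(50) NULL,
    Amount     VARCHAR(50) NULL
);

-- Clean staging: typed columns, filled from the raw table after validation.
CREATE TABLE dbo.ImportClean (
    RawId      INT           NOT NULL,  -- ties each clean row back to its raw source
    CustomerId INT           NOT NULL,
    OrderDate  DATETIME      NOT NULL,
    Amount     DECIMAL(18,2) NOT NULL
);

-- Research query: raw rows that never made it into the clean table.
SELECT r.RawId, r.CustomerId, r.OrderDate, r.Amount
FROM dbo.ImportRaw AS r
LEFT JOIN dbo.ImportClean AS c ON c.RawId = r.RawId
WHERE c.RawId IS NULL;
```

Because the raw rows stay in a permanent table, that last query can be rerun days later without reloading the file.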

You can also use SSIS and skip the staging table(s), but I find the ability to go back and research without having to reload a 50,000,000-row table very helpful.

answered Sep 20 '22 14:09

HLGEM

If you don't use tempdb, make sure the recovery model of the database you are working in is not set to "Full". Under the full recovery model those 50M-row inserts are fully logged, which adds a lot of overhead.
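A quick way to check which recovery model the current database is using, run from within that database:

```sql
-- Returns FULL, BULK_LOGGED or SIMPLE for the database you are connected to.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = DB_NAME();
```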

Ideally, you should use a staging database, simple recovery model, on RAID 10 if possible, and size it ahead of time to provide enough space for all your operations. Turn auto-grow off.
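A sketch of what such a staging database could look like. The database name, file paths and sizes are assumptions for illustration; the sizes in particular need to come from your own data volumes.

```sql
-- Hypothetical names, paths and sizes; adjust to the actual server.
CREATE DATABASE StagingDB
ON PRIMARY (
    NAME = StagingDB_data,
    FILENAME = 'D:\SQLData\StagingDB.mdf',
    SIZE = 100GB,      -- sized up front for the whole import
    FILEGROWTH = 0     -- 0 turns auto-grow off for this file
)
LOG ON (
    NAME = StagingDB_log,
    FILENAME = 'E:\SQLLogs\StagingDB.ldf',
    SIZE = 20GB,
    FILEGROWTH = 0
);
GO  -- batch separator used by SSMS and sqlcmd

ALTER DATABASE StagingDB SET RECOVERY SIMPLE;
GO
```

With auto-grow off, an undersized file makes the load fail instead of stalling while the file grows, so it is worth being generous with the initial sizes.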

Use INSERT ... WITH (TABLOCK) so the insert can be minimally logged instead of being logged row by row:

INSERT INTO StagingTable WITH (TABLOCK) (.....)
SELECT .....

Likewise for BULK INSERT. If you drop and recreate, create your clustered index prior to insert. If you can't, insert into one table first, then insert from that into another table with the right clustering, and truncate the first table. Avoid small batch sizes on BULK INSERT if possible. Read the BULK INSERT documentation closely, as you can sabotage performance with the wrong options.
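For illustration, a drop-and-recreate load might look roughly like this. The column list, index key, file path and terminators are all assumptions about the feed, not something taken from the question.

```sql
-- Hypothetical table layout and file path.
CREATE TABLE dbo.StagingTable (
    OrderDate  DATETIME      NOT NULL,
    CustomerId INT           NOT NULL,
    Amount     DECIMAL(18,2) NOT NULL
);

-- Create the clustered index while the table is still empty, before the load.
CREATE CLUSTERED INDEX CIX_StagingTable
    ON dbo.StagingTable (OrderDate, CustomerId);

BULK INSERT dbo.StagingTable
FROM 'D:\Feeds\orders.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK
    -- BATCHSIZE is deliberately not set, so the file loads as a single batch.
);
```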

Avoid INSERT ... EXEC. Every row is logged.

Avoid UPDATEs, unless you need to calculate running totals. Generally, it is cheaper to insert from one table into another, and then truncate the first table, than to update in place. Running total calculations are the exception, since they can be done with an UPDATE and variables to accumulate values between rows.
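As a sketch of the insert-then-truncate pattern: dbo.StagingStep1 and dbo.StagingStep2 are hypothetical tables with the same columns, and the multiplication stands in for whatever change the UPDATE would have made.

```sql
-- Instead of: UPDATE dbo.StagingStep1 SET Amount = Amount * 100;
-- apply the change while copying the rows into the next table.
INSERT INTO dbo.StagingStep2 WITH (TABLOCK) (OrderDate, CustomerId, Amount)
SELECT OrderDate, CustomerId, Amount * 100
FROM dbo.StagingStep1;

-- The first table is no longer needed once the copy has succeeded.
TRUNCATE TABLE dbo.StagingStep1;
```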

Avoid table variables for anything except control structures, since they prevent parallelization. Do not join your 50M-row table to a table variable; use a temp table instead.
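For example, a lookup that might otherwise go into a table variable can go into a temp table before the join. dbo.Customer and the column names are hypothetical; StagingTable is the big table.

```sql
-- Temp table instead of DECLARE @Customers TABLE (...).
CREATE TABLE #Customers (
    CustomerId INT         NOT NULL PRIMARY KEY,
    Region     VARCHAR(20) NOT NULL
);

INSERT INTO #Customers (CustomerId, Region)
SELECT CustomerId, Region
FROM dbo.Customer;   -- hypothetical source of the lookup rows

-- Join the large staging table to the temp table.
SELECT c.Region, COUNT(*) AS RowsLoaded
FROM dbo.StagingTable AS s
JOIN #Customers AS c ON c.CustomerId = s.CustomerId
GROUP BY c.Region;

DROP TABLE #Customers;
```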

Don't be afraid of cursors for iteration. Use cursor variables, and declare them with the STATIC keyword against low-cardinality columns at the front of the clustered index. Use this to slice big tables into more manageable chunks.
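A rough sketch of slicing with a STATIC cursor variable. It assumes StagingTable is clustered with OrderDate as the leading column and that OrderDate has few distinct values; dbo.FinalTable is a hypothetical target.

```sql
DECLARE @slices CURSOR;
DECLARE @OrderDate DATETIME;

-- STATIC takes a snapshot of the slice list, so it does not change
-- while the loop below is running.
SET @slices = CURSOR STATIC FOR
    SELECT DISTINCT OrderDate
    FROM dbo.StagingTable
    ORDER BY OrderDate;

OPEN @slices;
FETCH NEXT FROM @slices INTO @OrderDate;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Each pass handles one slice, located through the clustered index.
    INSERT INTO dbo.FinalTable (OrderDate, CustomerId, Amount)
    SELECT OrderDate, CustomerId, Amount
    FROM dbo.StagingTable
    WHERE OrderDate = @OrderDate;

    FETCH NEXT FROM @slices INTO @OrderDate;
END

CLOSE @slices;
DEALLOCATE @slices;
```

The cursor only iterates over the handful of slice values; the work on each slice is still set-based.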

Don't try to do too much in any one statement.

answered Sep 21 '22 14:09

Peter Radocchia

Tags:

sql

sql-server

tsql

sql-server-2008

ManishKumar1980
