Multiple INNER JOINs are too slow in SQL Server

Tags:

performance

inner-join

sql-server

I'm having a performance problem.

I created a table that receives data from a file through a BULK INSERT. Then I run a SELECT with multiple INNER JOINs (11 tables, 10 joins) to insert the correct data into another table.

When I run this SELECT, it takes too long (more than an hour), so I stop it. My workaround was to break the query into 3 parts, storing the intermediate results in @temp tables. To my surprise, that takes 3 minutes. What I'm trying to understand is WHY breaking my query into 3 was FASTER than a single SELECT statement. Here is my query:

SELECT t1.ReturnINT, t1.ReturnBIT, t2.ReturnINT, t3.ReturnINT, t5.ReturnINT, t1.ReturnDateTime
FROM t1
INNER JOIN t2
    ON t2.my_column_varchar = t1.my_column_varchar
INNER JOIN t3
    ON t3.my_column_number = t1.my_column_number AND t2.my_column_ID = t3.my_column_ID
INNER JOIN t4
    ON t4.my_column_varchar = t1.my_column_varchar
INNER JOIN t5
    ON t5.my_column_int = t1.my_column_int AND t5.my_column_int = t4.my_column_int AND t2.my_column_int = t5.my_column_int
INNER JOIN t6
    ON t6.my_column_int = t5.my_column_int AND t6.my_column_int = t2.my_column_int
INNER JOIN t7
    ON t7.my_column_int = t6.my_column_int
INNER JOIN t8
    ON t8.my_column_int = t3.my_column_int AND t8.my_column_datetime = t1.my_column_datetime
INNER JOIN t9
    ON t9.my_column_int = t3.my_column_int AND t8.my_column_datetime BETWEEN t9.my_column_datetime1 AND t9.datetime1 + t9.my_column_datetime2
INNER JOIN t10
    ON t10.my_column_int = t9.my_column_int AND t10.my_column_int = t6.my_column_int
INNER JOIN t11
    ON t11.my_column_int = t9.my_column_int AND t8.my_column_datetime = t11.my_column_datetime

----EDITED----

There is NO WHERE clause; my query is exactly as I posted it here.

Here are my split queries, which I forgot to include. Together they run in 3 minutes.

DECLARE @temp TABLE (
    <Some_columns>
)
INSERT INTO @temp
    SELECT <My_Linked_Columns>
    FROM t1
    INNER JOIN t2
        ON t2.my_column_varchar = t1.my_column_varchar
    INNER JOIN t3
        ON t3.my_column_number = t1.my_column_number AND t2.my_column_ID = t3.my_column_ID
    INNER JOIN t4
        ON t4.my_column_varchar = t1.my_column_varchar
    INNER JOIN t5
        ON t5.my_column_int = t1.my_column_int AND t5.my_column_int = t4.my_column_int AND t2.my_column_int = t5.my_column_int


DECLARE @temp2 TABLE(
    <Some_Columns>
)
INSERT INTO @temp2
    SELECT <More_Linked_Columns>
    FROM @temp as temp
    INNER JOIN t6
        ON t6.my_column_int = temp.my_column_int AND t6.my_column_int = temp.my_column_int
    INNER JOIN t7
        ON t7.my_column_int = t6.my_column_int
    INNER JOIN t8
        ON t8.my_column_int = temp.my_column_int AND t8.my_column_datetime = temp.my_column_datetime


DECLARE @temp3 TABLE(
    <Some_Columns>
)
INSERT INTO @temp3
    SELECT <More_Linked_Columns>
    FROM @temp2 AS temp2
    INNER JOIN t9
        ON t9.my_column_int = temp2.my_column_int AND temp2.my_column_datetime BETWEEN t9.my_column_datetime1 AND t9.datetime1 + t9.my_column_datetime2
    INNER JOIN t10
        ON t10.my_column_int = t9.my_column_int AND t10.my_column_int = temp2.my_column_int
    INNER JOIN t11
        ON t11.my_column_int = t9.my_column_int AND temp2.my_column_datetime = t11.my_column_datetime


SELECT <All_Final_Columns>
FROM @temp3

----EDITED 3----

After studying this further, I found a problem in the execution plan: a Nested Loops operator estimates 1 row but actually returns 1,204,014 rows. I guess the problem is exactly here, but I haven't found a way to solve it without breaking my query into 3 parts. (Now I know why breaking it up is faster, hehehe.)


asked Aug 07 '15 17:08

Alexandre_Almeida

1 Answer

Most common reasons:

Reason 1: When two tables with n and m rows have a many-to-many relationship on the join columns, the INNER JOIN can approach a CROSS JOIN and produce a result set with more than MAX(n, m) rows; in theory, up to n x m rows are possible.
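A minimal T-SQL sketch of that row multiplication. The table variables @a and @b and their data are made up for illustration and are not taken from the question:

```sql
-- Hypothetical tables: every row in both tables shares the same join key.
DECLARE @a TABLE (k INT NOT NULL);
DECLARE @b TABLE (k INT NOT NULL);

INSERT INTO @a (k) VALUES (1), (1), (1);
INSERT INTO @b (k) VALUES (1), (1), (1);

-- Each of the 3 rows in @a matches each of the 3 rows in @b,
-- so the join returns 3 x 3 = 9 rows.
SELECT COUNT(*) AS joined_rows
FROM @a AS a
INNER JOIN @b AS b
    ON b.k = a.k;

-- Diagnostic: list join keys that occur more than once in a table.
-- Any key listed here multiplies the rows coming from the other side.
SELECT k, COUNT(*) AS rows_per_key
FROM @a
GROUP BY k
HAVING COUNT(*) > 1;
```

Running the same GROUP BY ... HAVING check against the real join columns is one way to see whether a given join is actually many-to-many.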

Now imagine many such tables chained together with INNER JOINs.

The intermediate result set keeps growing with each join and starts eating into the memory allocated to the query.

This could be a reason why temp tables might help you.
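As a rough sketch of that staging approach: the first joins are materialized into an intermediate table, and the later joins start from that table instead of from one very large plan. All table names below (#src_a, #src_b, #stage1) are hypothetical stand-ins, not the tables from the question. A local temporary table is used rather than a table variable on the assumption that statistics help here, since temporary tables carry column statistics and table variables do not.

```sql
-- Hypothetical source tables standing in for the real ones.
CREATE TABLE #src_a (id INT NOT NULL, val INT NOT NULL);
CREATE TABLE #src_b (id INT NOT NULL, val INT NOT NULL);

INSERT INTO #src_a (id, val) VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO #src_b (id, val) VALUES (1, 100), (2, 200), (4, 400);

-- Stage 1: materialize the result of the first join(s).
-- Only ids 1 and 2 exist in both tables, so #stage1 holds two rows.
SELECT a.id, a.val AS a_val, b.val AS b_val
INTO #stage1
FROM #src_a AS a
INNER JOIN #src_b AS b
    ON b.id = a.id;

-- Optional: index the staged rows on the column used by the next join.
CREATE CLUSTERED INDEX IX_stage1_id ON #stage1 (id);

-- Stage 2: the remaining joins are compiled as a separate, smaller statement
-- that reads from the already materialized #stage1.
SELECT s.id, s.a_val, s.b_val
FROM #stage1 AS s
INNER JOIN #src_a AS a
    ON a.id = s.id;

DROP TABLE #stage1, #src_a, #src_b;
```

Whether this beats the single statement depends on the data and the plan, so it is worth comparing estimated and actual row counts in both versions.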

Reason 2: You do not have indexes on the columns you are joining the tables on.
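A small sketch of what such an index could look like. The table #t2_demo and the index name are invented for this example; only the column names my_column_varchar and ReturnINT are borrowed from the question, and the column types are assumptions:

```sql
-- Hypothetical stand-in for one of the joined tables.
CREATE TABLE #t2_demo (
    my_column_varchar VARCHAR(50) NOT NULL,
    ReturnINT         INT         NOT NULL
);

-- Index the join column; INCLUDE adds the returned column so the
-- join can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_t2_demo_my_column_varchar
    ON #t2_demo (my_column_varchar)
    INCLUDE (ReturnINT);

-- List the indexes that now exist on the table.
SELECT i.name, i.type_desc
FROM tempdb.sys.indexes AS i
WHERE i.object_id = OBJECT_ID('tempdb..#t2_demo');

DROP TABLE #t2_demo;
```

The same pattern would apply to each column that appears in an ON clause, though indexing every join column is not always worth the extra write cost.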

Reason 3: Do you have functions in the WHERE clause? Wrapping a column in a function can prevent the optimizer from using an index on that column.
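To illustrate the difference, here is a sketch with a made-up #events table (not from the question). Both queries return the same count, but only the second leaves the indexed column bare so that an index seek is possible:

```sql
-- Hypothetical table with an index on the filtered column.
CREATE TABLE #events (event_time DATETIME NOT NULL);
CREATE NONCLUSTERED INDEX IX_events_event_time ON #events (event_time);

INSERT INTO #events (event_time)
VALUES ('20150807 17:08'), ('20160101 00:00');

-- Function applied to the column: the predicate cannot be matched
-- directly against the index, so this typically scans.
SELECT COUNT(*) AS rows_2015
FROM #events
WHERE YEAR(event_time) = 2015;

-- Equivalent range predicate on the bare column: eligible for an index seek.
SELECT COUNT(*) AS rows_2015
FROM #events
WHERE event_time >= '20150101'
  AND event_time <  '20160101';

DROP TABLE #events;
```

The same idea applies to expressions inside join conditions, not only to the WHERE clause.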

answered Sep 19 '22 06:09

DhruvJoshi
