
Logarithmically increasing execution time for each loop of a ForEach control

First, some background: I'm an SSIS newbie and I've just completed my second data-import project.

The package is very simple and consists of a dataflow that imports a tab-separated customer values file of ~30,000 records into an ADO recordset variable which in turn is used to power a ForEach Loop Container that executes a piece of SQL passing in values from each row of the recordset.

The import of the first ~21,000 records took 59 hours before the package failed! The last ~9,000 took a further 8 hours. Yes, 67 hours in total!

The SQL consists of a check to determine whether the record already exists, a call to a procedure to generate a new password, and a final call to another procedure to insert the customer data into our system. The final procedure returns a recordset, but I'm not interested in the result, so I have just ignored it. I don't know whether SSIS discards the recordset or not. I am aware that this is the slowest possible way of getting the data into the system, but I did not expect it to be this slow, nor to fail two thirds of the way through, and then again whilst processing the last ~9,000.

When I tested a ~3,000-record subset on my local machine, the Execute Package Utility reported that each insert was taking approximately 1 second. A bit of quick math (30,000 records at ~1 second each is roughly 8.3 hours) suggested that the total import would take around 8 hours to run. That seemed like a long time, but it was what I had expected given all that I had read about SSIS and RBAR execution. I figured that the final import would be a bit quicker as the server is considerably more powerful. I am accessing the server remotely, but I wouldn't have expected that to be an issue, as I have performed imports in the past using bespoke C# console applications with simple ADO connections, and nothing has run anywhere near as slowly.

Initially the destination table wasn’t optimised for the existence check, and I thought this could be the cause of the slow performance. I added an appropriate index to the table to change the test from a scan to a seek, expecting that this would get rid of the performance issue. Bizarrely it seemed to have no visible effect!
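
For reference, an index covering the three columns in the existence check looks something like this (the index name and exact definition here are illustrative, not the actual script):

-- Illustrative only: an index so the existence check can seek rather than scan.
-- Column names match the check in the import SQL further down; the index
-- name is an assumption.
CREATE NONCLUSTERED INDEX IX_tblSubscriber_Email_Product_Source
ON [dbo].[tblSubscriber] (strSubscriberEmail, ProductId, strTrialSource);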

The reason we use the sproc to insert the data into our system is for consistency. It represents the same route that the data takes if it is inserted into our system via our web front-end. The insertion of the data also causes a number of triggers to fire and update various other entities in the database.

What's been occurring during this import, though, and what has me scratching my head, is that the execution time for the SQL batch, as reported by the output of the Execute Package Utility, has been increasing logarithmically during the run. What starts out as a sub-one-second execution time ends up, over the course of the import, at greater than 20 seconds, and eventually the package simply ground to a complete halt.

I've searched all over the web multiple times (thanks, Google), as well as Stack Overflow, and haven't found anything that describes these symptoms.

Hopefully someone out there has some clues.

Thanks

In response to ErikE: (I couldn’t fit this into a comment, so I've added it here.)

Erik, as per your request I ran the profiler over the database whilst putting the three-thousand-item test file through its paces.

I wasn't able to easily figure out how to get SSIS to inject a marker into the SQL that would show up in the profiler, so I just ran the profiler for the whole run. I know there is some overhead associated with this but, theoretically, it should be more or less consistent over the run.
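
For what it's worth, one way such a marker could be injected (a sketch only, not something I actually did) is to prefix each iteration's batch with a comment carrying the ForEach loop counter, then filter the saved trace on that marker:

-- Hypothetical: each iteration's batch is prefixed with a comment such as
--   /* import row 1234 */
-- (the number coming from the ForEach loop counter), making it easy to pick
-- individual iterations out of the saved trace. The trace file path is made up.
SELECT TextData, CPU, Reads, Writes, Duration
FROM sys.fn_trace_gettable(N'C:\Traces\import_run.trc', DEFAULT)
WHERE CAST(TextData AS nvarchar(max)) LIKE N'%/* import row 1234 */%';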

The duration on a per item basis remains pretty constant over the whole run.

Below is cropped output from the trace. In the run I've done here, the first 800 rows overlapped previously entered data, so the system was effectively doing no work (yay, indexes!). As soon as the index stopped being useful and the system was actually inserting new data, you can see the times jump accordingly, but they don't seem to change much, if at all, between the first and last elements, with the number of reads being by far the largest figure.

------------------------------------------
| Item | CPU | Reads | Writes | Duration |
------------------------------------------
| 0001 |   0 |    29 |    0   |     0    |
| 0002 |   0 |    32 |    0   |     0    |
| 0003 |   0 |    27 |    0   |     0    |
|…                                       |
| 0799 |   0 |    32 |    0   |     0    |
| 0800 |  78 |  4073 |   40   |   124    |
| 0801 |  32 |  2122 |    4   |    54    |
| 0802 |  46 |  2128 |    8   |   174    |
| 0803 |  46 |  2128 |    8   |   174    |
| 0804 |  47 |  2131 |   15   |   242    |
|…                                       |
| 1400 |  16 |  2156 |    1   |    54    |
| 1401 |  16 |  2167 |    3   |    72    |
| 1402 |  16 |  2153 |    4   |    84    |
|…                                       |
| 2997 |  31 |  2193 |    2   |    72    |
| 2998 |  31 |  2195 |    2   |    48    |
| 2999 |  31 |  2184 |    2   |    35    |
| 3000 |  31 |  2180 |    2   |    53    |
------------------------------------------

Overnight I also put the system through a full re-run of the import with the profiler switched on, to see how things fared. It managed to get through one third of the import in 15.5 hours on my local machine. I exported the trace data to a SQL table so that I could get some statistics from it. Looking at the data in the trace, the delta between inserts increases by roughly one second per thousand records processed, so by the time it reaches record 10,000 it is taking 10 seconds per record to perform the insert.

The actual code being executed for each record is below. Don't bother critiquing the procedure: the SQL was written by a self-taught developer who was originally our receptionist, long before anyone with formal development training was employed by the company. We are well aware that it isn't good. The main point is that I believe it should execute at a constant rate, and it very obviously doesn't.

if not exists
(
    select 1
    from [dbo].[tblSubscriber]
    where strSubscriberEmail = @EmailAddress
    and ProductId = @ProductId
    and strTrialSource = @Source
)
begin
    declare @ThePassword varchar(20)
    select @ThePassword = [dbo].[DefaultPassword]()

    exec [dbo].[MemberLookupTransitionCDS5] 
     @ProductId
    ,@EmailAddress
    ,@ThePassword
    ,NULL --IP Address
    ,NULL --BrowserName
    ,NULL --BrowserVersion
    ,2 --blnUpdate
    ,@FirstName --strFirstName
    ,@Surname --strLastName
    ,@Source --strTrialSource
    ,@Comments --strTrialComments
    ,@Phone --strSubscriberPhone
    ,@TrialType --intTrialType
    ,NULL --Redundant MonitorGroupID
    ,NULL --strTrialFirstPage
    ,NULL --strTrialRefererUrl
    ,30 --intTrialSubscriptionDaysLength
    ,0 --SourceCategoryId
end
GO
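
For anyone wanting to reproduce the timing figures, a query along these lines over the exported trace table produces the same kind of per-row deltas (dbo.ImportTrace is a stand-in name; RowNumber and StartTime are standard trace columns; LAG needs SQL Server 2012 or later):

-- Sketch only: gap between consecutive statement start times in a
-- Profiler trace that has been exported to a table.
SELECT
    RowNumber,
    DATEDIFF(ms, LAG(StartTime) OVER (ORDER BY RowNumber), StartTime) AS DeltaMs
FROM dbo.ImportTrace
ORDER BY RowNumber;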

Results of computing the difference in time between consecutive executions (cropped for brevity).

----------------------
|  Row  | Delta (ms) |
----------------------
|   500 |  510       |
|  1000 |  976       |
|  1500 | 1436       |
|  2000 | 1916       |
|  2500 | 2336       |
|  3000 | 2816       |
|  3500 | 3263       |
|  4000 | 3726       |
|  4500 | 4163       |
|  5000 | 4633       |
|  5500 | 5223       |
|  6000 | 5563       |
|  6500 | 6053       |
|  7000 | 6510       |
|  7500 | 6926       |
|  8000 | 7393       |
|  8500 | 7846       |
|  9000 | 8503       |
|  9500 | 8820       |
| 10000 | 9296       |
| 10500 | 9750       |
----------------------
Asked by Aspermont, Jul 20 '12


1 Answer

Let's take some steps:

  1. Advice: Isolate whether it is a server issue or a client one. Run a trace and see how long the first insert takes compared to the 3000th. Include in the SQL statements some difference on the 1st and 3000th iterations that can be filtered for in the trace, so that it is not capturing the other events. Try to avoid capturing statement completion events; use the batch or RPC completion events instead.

    Response: The recorded CPU, reads, and duration from your profiler trace are not increasing, but the actual elapsed/effective insert time is.

  2. Advice: Assuming that the above pattern holds true through the 10,000th insert (please advise if different), my best guess is that some blocking is occurring, maybe something like a constraint validation that is doing a nested loop join, which would scale logarithmically with the number of rows in the table just as you are seeing. Would you please do the following:

    • Provide the full execution plan of the INSERT statement using SET SHOWPLAN_TEXT ON (there is a short sketch of this and the trace setup after this list).
    • Run a trace on the Blocked Process Report event and report on anything interesting.
    • Read "Eliminating Deadlocks Caused by Foreign Keys with Large Transactions" and let me know if this might be the cause or if I am barking up the wrong tree.

    If none of this makes progress on the problem, simply update your question with any new information and comment here, and I'll continue to do my best to help.
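
To make the first two bullets concrete, here is a minimal sketch of the commands involved (the table and column names come from the question; the sample literals and the threshold value are illustrative only):

-- 1. Capture the estimated plan without executing anything.
--    SET SHOWPLAN_TEXT ON must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO
select 1
from [dbo].[tblSubscriber]
where strSubscriberEmail = 'test@example.com'   -- sample literals standing in
and ProductId = 1                               -- for the SSIS parameters
and strTrialSource = 'import';
GO
SET SHOWPLAN_TEXT OFF;
GO
-- Wrapping the EXEC of [dbo].[MemberLookupTransitionCDS5] the same way should
-- also return the estimated plans of the statements inside the procedure,
-- including the INSERT.

-- 2. Surface blocking to the trace: raise the blocked process threshold
--    (in seconds; 5 is only an example) and add the "Blocked process report"
--    event (under Errors and Warnings) to the Profiler trace.
exec sp_configure 'show advanced options', 1;
RECONFIGURE;
exec sp_configure 'blocked process threshold', 5;
RECONFIGURE;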

Answered by ErikE