
SSIS Stored Procedure uses Temp Table 2008 and 2014

I'm currently writing an SSIS package that retrieves data from a stored procedure via an OLE DB Source. The stored procedure contains a rather nasty query that I've been able to improve with the use of temp tables. If I switch these temp tables to table variables, the logical reads jump from about 1.3 million to about 56 million. I'm uncomfortable enough with the 1.3 million, but there is no way that I can be satisfied with the 56 million logical reads. Thus, I can't really convert the temp tables to table variables.
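For context, "logical reads" here means the kind of numbers reported by SET STATISTICS IO; a comparison like the one above can be reproduced along these lines (the procedure name is a stand-in for the real one):

    -- Compare logical reads between the temp-table and table-variable versions.
    -- dbo.usp_LoadSourceData is a hypothetical stand-in for the real procedure.
    SET STATISTICS IO ON;
    EXEC dbo.usp_LoadSourceData;
    SET STATISTICS IO OFF;
    -- The per-table "logical reads" counts appear on the Messages tab.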

However, SSIS (or rather SQL Server) can't resolve the metadata for this query, so the package won't run. I've found a few different solutions online, but none of them seem to work for both SQL Server 2008 and SQL Server 2014. We are currently in the process of upgrading all of our servers to 2014, and this particular package runs against 2008 in DEV, 2014 in QA, and 2008 in production currently. By the fall, the PROD tier will be 2014, and the DEV tier will be promoted sometime after that. Unfortunately, I can't wait until these upgrades happen to release this SSIS package. The data needs to start moving by next week. Thus, I need to figure out a way to get the metadata resolved for both environments. Here's what I've tried so far:

  1. Add a dummy SELECT in an IF 1 = 0 block which returns the proper metadata. This works in 2008, but not in 2014. (See the first sketch after this list.)

  2. Use SET FMTONLY OFF at the beginning of the stored procedure. This works in 2008, but not in 2014. Furthermore, it causes the stored procedure to run once for each column returned (over 30 in this case), which is a deal-breaker even where it does work.

  3. Use EXEC ... WITH RESULT SETS (( ... ));. This works in 2014, but not in 2008. (See the second sketch after this list.)

  4. Deploy a stored procedure which returns the proper metadata, build and deploy the SSIS package, then modify the stored procedure to the proper version. This hasn't seemed to work in either environment, and it would complicate the development of any other ETL applications within our ETL framework.
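
For reference, option 1 looks roughly like this; the procedure and column names below are stand-ins for the real ones:

    CREATE PROCEDURE dbo.usp_LoadSourceData   -- hypothetical name
    AS
    BEGIN
        -- This branch never executes, but 2008's metadata discovery
        -- (SET FMTONLY ON) walks it and picks up the result shape.
        -- SQL Server 2012+ uses sp_describe_first_result_set instead,
        -- which still fails on the temp-table statements below, so the
        -- trick no longer helps there.
        IF 1 = 0
        BEGIN
            SELECT CAST(NULL AS INT)         AS OrderID,
                   CAST(NULL AS VARCHAR(50)) AS CustomerName,
                   CAST(NULL AS DATETIME)    AS OrderDate;
        END;

        CREATE TABLE #Results
        (
            OrderID      INT,
            CustomerName VARCHAR(50),
            OrderDate    DATETIME
        );

        -- ... populate #Results with the real (nasty) query here ...

        SELECT OrderID, CustomerName, OrderDate
        FROM #Results;
    END;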

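And option 3, the 2012+ counterpart, using the same stand-in column list:

    -- Works against SQL Server 2012+ only; the clause is a syntax error on 2008.
    EXEC dbo.usp_LoadSourceData
    WITH RESULT SETS
    ((
        OrderID      INT,
        CustomerName VARCHAR(50),
        OrderDate    DATETIME
    ));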
If I can't figure anything out, I could deploy different stored procedures and packages to the different tiers, but I would very much prefer to avoid this. For one, it would complicate future releases, and I would also need to make sure I don't forget to update the stored procedure and package once we upgrade the servers.

I could also make real tables in the database to take the place of these temp tables. I don't really like this solution, but it's something that I could tolerate. If I end up doing this, I would probably switch to using WITH RESULT SETS in the future.
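
If I go that route, the shape would be roughly this, with a run identifier so concurrent executions don't collide (all names here are made up):

    -- Permanent work table standing in for the temp table.
    CREATE TABLE dbo.Work_Results
    (
        RunId        UNIQUEIDENTIFIER NOT NULL,
        OrderID      INT,
        CustomerName VARCHAR(50),
        OrderDate    DATETIME
    );

    -- Inside the stored procedure:
    DECLARE @RunId UNIQUEIDENTIFIER = NEWID();

    INSERT INTO dbo.Work_Results (RunId, OrderID, CustomerName, OrderDate)
    SELECT @RunId, o.OrderID, c.CustomerName, o.OrderDate
    FROM dbo.Orders AS o                        -- stand-ins for the real query
    JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;

    SELECT OrderID, CustomerName, OrderDate
    FROM dbo.Work_Results
    WHERE RunId = @RunId;

    DELETE FROM dbo.Work_Results WHERE RunId = @RunId;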

However, I personally don't care much for either of these solutions, so I was wondering if there is any workaround that I missed that might work a bit better.

asked Aug 06 '15 by Nelson O

People also ask

Can you use temp tables in stored procedures?

You can create and use temporary tables in a stored procedure, but the temporary table exists only for the duration of the stored procedure that creates it. When the procedure completes, Adaptive Server automatically drops the temporary table.
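
A minimal illustration (procedure and table names are made up):

    CREATE PROCEDURE dbo.usp_TempTableDemo   -- hypothetical name
    AS
    BEGIN
        CREATE TABLE #Scratch (Id INT PRIMARY KEY, Label VARCHAR(20));
        INSERT INTO #Scratch VALUES (1, 'alpha'), (2, 'beta');
        SELECT Id, Label FROM #Scratch;
    END;   -- #Scratch is dropped automatically when the procedure completes
    GO

    EXEC dbo.usp_TempTableDemo;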

Can we use local temp tables in SSIS?

Use delayed validation. If you absolutely must use temp tables in your SSIS package, turn on the Delay Validation setting on the data flow tasks that use temp tables.

What stored procedure is used to run an SSIS package?

In order to execute the SSIS package from the catalog, we have to execute at least two stored procedures from the SSISDB database: create_execution and start_execution. In create_execution we specify the details needed to identify the package to be executed (folder name, project name, and package name).
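
A minimal example of that pair of calls (the folder, project, and package names are placeholders):

    DECLARE @execution_id BIGINT;

    EXEC SSISDB.catalog.create_execution
         @folder_name     = N'MyFolder',     -- placeholder names
         @project_name    = N'MyProject',
         @package_name    = N'MyPackage.dtsx',
         @use32bitruntime = 0,
         @execution_id    = @execution_id OUTPUT;

    EXEC SSISDB.catalog.start_execution @execution_id;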


1 Answer

Despite your reluctance, I think you've made the right choice and a dedicated staging area is the right way to go. Most of the production ETLs I've worked with have a dedicated staging database, never mind tables. You then have the benefit of being able to control the storage more explicitly, which makes performance more reliable and the whole thing generally more maintainable. For example, you can create a dedicated contiguous block of fast disk space for these tables with their own filegroup, etc. I'd certainly rather see 2 separate SPs relying on a few physical tables than a really gnarly single one.
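
To sketch what I mean (the database name, file path, and table are all illustrative):

    -- Give the staging tables their own filegroup on fast disk.
    ALTER DATABASE StagingDB ADD FILEGROUP FG_Staging;

    ALTER DATABASE StagingDB
    ADD FILE
    (
        NAME = N'staging_data',
        FILENAME = N'F:\FastDisk\staging_data.ndf',   -- illustrative path
        SIZE = 4GB,
        FILEGROWTH = 512MB
    )
    TO FILEGROUP FG_Staging;

    -- Staging tables are then placed on that filegroup explicitly.
    CREATE TABLE dbo.Stage_Orders
    (
        RunId   UNIQUEIDENTIFIER NOT NULL,
        OrderID INT NOT NULL
    ) ON FG_Staging;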

That said, without knowing any specifics this is just my experience, so a caveat for future readers: As with all things database, be sure to measure the actual performance of your scenario (before and after) rather than making any assumptions based on the query plan - it might be misleading you.

answered Oct 11 '22 by Matt