Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Start stored procedures sequentially or in parallel

We have a stored procedure that runs nightly that in turn kicks off a number of other procedures. Some of those procedures could logically be run in parallel with some of the others.

  • How can I indicate to SQL Server whether a procedure should be run in parallel or serial — ie: kicked off of asynchronously or blocking?
  • What would be the implications of running them in parallel, keeping in mind that I've already determined that the processes won't be competing for table access or locks- just total disk io and memory. For the most part they don't even use the same tables.
  • Does it matter if some of those procedures are the same procedure, just with different parameters?
  • If I start a pair or procedures asynchronously, is there a good system in SQL Server to then wait for both of them to finish, or do I need to have each of them set a flag somewhere and check and poll the flag periodically using WAITFOR DELAY?

At the moment we're still on SQL Server 2000.

As a side note, this matters because the main procedure is kicked off in response to the completion of a data dump into the server from a mainframe system. The mainframe dump takes all but about 2 hours each night, and we have no control over it. As a result, we're constantly trying to find ways to reduce processing times.

like image 207
Joel Coehoorn Avatar asked Dec 08 '08 15:12

Joel Coehoorn


2 Answers

I had to research this recently, so found this old question that was begging for a more complete answer. Just to be totally explicit: TSQL does not (by itself) have the ability to launch other TSQL operations asynchronously.

That doesn't mean you don't still have a lot of options (some of them mentioned in other answers):

  • Custom application: Write a simple custom app in the language of your choice, using asynchronous methods. Call a SQL stored proc on each application thread.
  • SQL Agent jobs: Create multiple SQL jobs, and start them asynchronously from your proc using sp_start_job. You can check to see if they have finished yet using the undocumented function xp_sqlagent_enum_jobs as described in this excellent article by Gregory A. Larsen. (Or have the jobs themselves update your own JOB_PROGRESS table as Chris suggests.) You would literally have to create separate job for each parallel process you anticipate running, even if they are running the same stored proc with different parameters.
  • OLE Automation: Use sp_oacreate and sp_oamethod to launch a new process calling the other stored proc as described in this article, also by Gregory A. Larsen.
  • DTS Package: Create a DTS or SSIS package with a simple branching task flow. DTS will launch tasks in individual spids.
  • Service Broker: If you are on SQL2005+, look into using Service Broker
  • CLR Parallel Execution: Use the CLR commands Parallel_AddSql and Parallel_Execute as described in this article by Alan Kaplan (SQL2005+ only).
  • Scheduled Windows Tasks: Listed for completeness, but I'm not a fan of this option.

I don't have much experience with Service Broker or CLR, so I can't comment on those options. If it were me, I'd probably use multiple Jobs in simpler scenarios, and a DTS/SSIS package in more complex scenarios.

One final comment: SQL already attempts to parallelize individual operations whenever it can*. This means that running 2 tasks at the same time instead of after each other is no guarantee that it will finish sooner. Test carefully to see whether it actually improves anything or not.

We had a developer that created a DTS package to run 8 tasks at the same time. Unfortunately, it was only a 4-CPU server :)

*Assuming default settings. This can be modified by altering the server's Maximum Degree of Parallelism or Affinity Mask, or by using the MAXDOP query hint.

like image 151
BradC Avatar answered Sep 28 '22 02:09

BradC


Create a couple of SQL Server agent jobs where each one runs a particular proc.

Then from within your master proc kick off the jobs.

The only way of waiting that I can think of is if you have a status table that each proc updates when it's finished.

Then yet another job could poll that table for total completion and kick off a final proc. Alternatively, you could have a trigger on this table.

The memory implications are completely up to your environment..

UPDATE: If you have access to the task system.. then you could take the same approach. Just have windows execute multiple tasks, each responsible for one proc. Then use a trigger on the status table to kick off something when all of the tasks have completed.

UPDATE2: Also, if you're willing to create a new app, you could house all of the logic in a single exe...

like image 27
NotMe Avatar answered Sep 28 '22 02:09

NotMe