
Notify Operator if ANY step in job fails

Can I (and if so, how do I) configure SQL Server 2008 to notify an operator if any step in a job fails?

I have a SQL Server job with several steps that update data from multiple sources, followed by one final step that performs several calculations on the data. All of the "data refresh" steps are set to "Go to next step on failure". Generally speaking, if one of the data refreshes fails, I still want the final step to run, but I also want to be notified about the intermediate failures so that I can investigate if they fail consistently.

Matt Murrell, asked Oct 04 '10 19:10



2 Answers

Here is how we do it. We add one last T-SQL step (usually called "check steps") containing this:

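-- Rows logged after the last job-outcome row (step_id = 0) belong to the current run
SELECT  step_name, message
FROM    msdb.dbo.sysjobhistory
WHERE   instance_id > COALESCE((SELECT MAX(instance_id) FROM msdb.dbo.sysjobhistory
                                WHERE job_id = $(ESCAPE_SQUOTE(JOBID)) AND step_id = 0), 0)
        AND job_id = $(ESCAPE_SQUOTE(JOBID))
        AND run_status <> 1 -- 1 = succeeded, so this keeps every step that did not succeed

-- Any row found means an earlier step failed: fail this step too
IF      @@ROWCOUNT <> 0
        RAISERROR('Ooops', 16, 1)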

Note that this code uses tokens in job steps (the $(...) parts), so it can't be executed in SSMS as is. It looks in sysjobhistory for entries written by the earlier steps of the current run of the job (everything logged after the last job-outcome row, step_id = 0) and checks whether any of them has a non-success status.
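To try the same check by hand in SSMS, the token can be swapped for a lookup by job name. The following is only a sketch: the job name is hypothetical, and it assumes (as the step above does) that the job-outcome row with step_id = 0 is the last row logged for each run. Because no run is in progress when you test by hand, it inspects the most recent completed run instead of the current one.

```sql
-- Ad-hoc variant for SSMS: no Agent tokens, job looked up by name.
DECLARE @job_id       uniqueidentifier,
        @last_outcome int,
        @prev_outcome int;

SELECT @job_id = job_id
FROM   msdb.dbo.sysjobs
WHERE  name = N'Nightly data refresh';   -- hypothetical job name

-- Outcome row (step_id = 0) that closes the most recent completed run
SELECT @last_outcome = MAX(instance_id)
FROM   msdb.dbo.sysjobhistory
WHERE  job_id = @job_id AND step_id = 0;

-- Outcome row of the run before it (0 if there is none)
SELECT @prev_outcome = COALESCE(MAX(instance_id), 0)
FROM   msdb.dbo.sysjobhistory
WHERE  job_id = @job_id AND step_id = 0
       AND instance_id < @last_outcome;

-- Steps of the most recent completed run that did not succeed
SELECT step_id, step_name, run_status, message
FROM   msdb.dbo.sysjobhistory
WHERE  job_id = @job_id
       AND instance_id > @prev_outcome
       AND instance_id < @last_outcome
       AND run_status <> 1               -- 1 = succeeded
ORDER BY instance_id;
```

An empty result means either that every step of the last run succeeded or that the job has no completed run in the history yet.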

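In the step's Properties -> Advanced page you can also check "Include step output in history" to record the message from the step failure. Leave this step's "On failure action" set to "Quit the job reporting failure", so that the job as a whole reports failure and its normal failure notification reaches the operator.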
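The same settings can also be scripted instead of clicked through. This is a rough sketch using the msdb procedures sp_update_jobstep and sp_update_job; the job name, step numbers and operator name are all hypothetical, and the operator must already exist with an email address.

```sql
-- Sketch only: job, step and operator names below are hypothetical.
USE msdb;

-- A data-refresh step: carry on with the next step if it fails.
EXEC dbo.sp_update_jobstep
     @job_name       = N'Nightly data refresh',
     @step_id        = 1,
     @on_fail_action = 3;      -- 3 = go to next step

-- The final "check steps" step: fail the job if the check raises an error,
-- and keep the step output in the job history.
EXEC dbo.sp_update_jobstep
     @job_name          = N'Nightly data refresh',
     @step_id           = 5,
     @on_success_action = 1,   -- 1 = quit reporting success
     @on_fail_action    = 2,   -- 2 = quit reporting failure
     @flags             = 4;   -- 4 = write T-SQL step output to step history

-- Email the operator whenever the job reports failure.
EXEC dbo.sp_update_job
     @job_name                   = N'Nightly data refresh',
     @notify_level_email         = 2,   -- 2 = on failure
     @notify_email_operator_name = N'DBA Team';
```

With this in place, one failed refresh step no longer stops the job, but the check step turns it into a job failure at the end, which triggers the operator email.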

wqw, answered Sep 16 '22 18:09


@wqw's accepted answer is excellent.

I've extended it for those who have Database Mail enabled, so that it emails a bit more detail about exactly what failed and how. It also incorporates icvader's answer on this page to take account of retries.

Should be really helpful for those of us who need more detail to judge whether urgent action is required when offsite/on-call.

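DECLARE

@YourRecipients as varchar(1000) = '[email protected]' -- set to your own recipient address(es)
,@YourMailProfileName as varchar(255) = 'Database Mail' -- name of an existing Database Mail profile

,@Msg as nvarchar(max) -- step messages can be long; varchar(1000) would cut the email body short
,@NumofFails as smallint
,@JobName as varchar(1000)
,@Subj as varchar(1000)
,@i as smallint = 1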


---------------Fetch List of Step Errors------------
SELECT
      ROW_NUMBER() over (order by step_id) rn -- numbers the failed steps 1..N; the loop below walks this
    , job_name
    , run_status
    , step_id
    , step_name
    , [message]
INTO #Errs

FROM

    (
    SELECT
      ROW_NUMBER() over (partition by step_id order by run_date desc, run_time desc) ReverseTryOrder
    ,j.name job_name
    ,run_status
    , step_id
    , step_name
    , [message]

    FROM    msdb.dbo.sysjobhistory h
    join msdb.dbo.sysjobs j on j.job_id = h.job_id

    WHERE   instance_id > COALESCE((SELECT MAX(instance_id) FROM msdb.dbo.sysjobhistory
                                    WHERE job_id = $(ESCAPE_SQUOTE(JOBID)) AND step_id = 0), 0)
            AND h.job_id = $(ESCAPE_SQUOTE(JOBID))
    ) as agg

WHERE ReverseTryOrder = 1 ---Pick the last retry attempt of each step
  AND run_status <> 1 -- keep only steps whose last attempt didn't succeed (1 = succeeded)


SET @NumofFails = ISNULL(@@ROWCOUNT,0)---Stored here because we'll still need the rowcount after it's reset.


-------------------------If there are any failures assemble email and send ------------------------------------------------
IF  @NumofFails <> 0
    BEGIN

        DECLARE @PluralS as char(1) = CASE WHEN @NumofFails > 1 THEN 's' ELSE '' END ---To make it look like a computer knows English
        SELECT top 1 @Subj = 'Job: ' + job_name + ' had ' + CAST(@NumofFails as varchar(3)) + ' step' + @PluralS + ' that failed'
                    ,@Msg =  'The trouble is... ' +CHAR(13) + CHAR(10)+CHAR(13) + CHAR(10)

                        FROM dbo.#Errs


        WHILE @i <= @NumofFails 
        BEGIN
            SELECT @Msg = @Msg + 'Step:' + CAST(step_id as varchar(3)) + ': ' + step_name  +CHAR(13) + CHAR(10)

            + [message] +CHAR(13) + CHAR(10)+CHAR(13) + CHAR(10) FROM dbo.#Errs
            WHERE rn = @i


            SET @i = @i + 1
        END

            exec msdb.dbo.sp_send_dbmail
            @recipients = @YourRecipients,
            @subject = @Subj,
            @profile_name = @YourMailProfileName,
            @body = @Msg


    END

One difference from the other answers on which it's based: this version doesn't fail the whole job by raising an error. That retains the distinction in job history between Aborted and Completed with Errors.
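If you do want both the detailed email and a failed job (so that an operator notification fires as well), one option is to raise an error after the mail has been sent. This is a sketch of that variation, not part of the script above; it assumes the line is placed inside the IF @NumofFails <> 0 block, right after the sp_send_dbmail call, and that the step's "On failure action" is "Quit the job reporting failure".

```sql
-- Optional: fail the step (and with it the job) once the email has gone out.
-- @NumofFails is the smallint variable declared at the top of the script.
RAISERROR('%d step(s) failed; see the notification email for details.', 16, 1, @NumofFails);
```

The trade-off is the one described above: the job history then shows the run as failed instead of completed with errors.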

Adamantish, answered Sep 17 '22 18:09