Question

The Scenario

There are certain SQL Agent Jobs that are scheduled to run every few minutes throughout the day.

There are legitimate times when it will miss its next schedule because it's still running from the previous schedule.

Every once and a while, a job might 'hang'. This doesn't produce a failure (since the job hasn't stopped yet). When this happens, the job can be manually stopped and works fine the next time it runs. It's designed to pick back up where it left off.

What's the most efficient way...?

I'd like a way to determine how long (in seconds) a SQL Agent Job named 'JobX' is currently running. If it isn't currently running, we can just return zero.

This way, I can stop the job if it has been running for an amount of time beyond a certain threshold.

I assume that a combination of xp_sqlagent_enum_jobs and sysjobhistory could be used, but I'm curious if there are better solutions out there... and can hopefully benefit from the obstacles the rest of you have already run into and worked around.

Was it helpful?

Solution

This solution would work:

SELECT DATEDIFF(SECOND,aj.start_execution_date,GetDate()) AS Seconds
FROM msdb..sysjobactivity aj
JOIN msdb..sysjobs sj on sj.job_id = aj.job_id
WHERE aj.stop_execution_date IS NULL -- job hasn't stopped running
AND aj.start_execution_date IS NOT NULL -- job is currently running
AND sj.name = 'JobX'
and not exists( -- make sure this is the most recent run
    select 1
    from msdb..sysjobactivity new
    where new.job_id = aj.job_id
    and new.start_execution_date > aj.start_execution_date
)

This a more general check dependent on system tables. If you'd prefer a custom route, you could have the job insert into a job log table you created instead.

OTHER TIPS

/**** FOR CURRENTLY RUNNING JOBS ****/
SELECT j.name AS Job_Name,DATEDIFF(ss,a.start_execution_date ,GETDATE ())   
FROM msdb.dbo.sysjobactivity a INNER JOIN msdb.dbo.sysjobs j 
ON a.job_id =j.job_id
WHERE CONVERT(DATE,a.start_execution_date )=CONVERT(DATE,GETDATE ())
AND a.stop_execution_date IS NULL


/*This code will give u the Name of currently running jobs and for how much time it is running & after that u can add filters to it as u wish*/
/*Thanks in advance*/

What you're doing manually sounds like what's known as a "watch dog loop". Basically, an SQL job that starts and/or monitors agent jobs, killing them if need be.

The code below was taken from here, and should help, removing the need for you to manually monitor and kill jobs if they've been running for a lengthy period of time:

/****************************************************************
--This SQL will take a list of SQL Agent jobs (names must match),
--start them so they're all running together, and then
--monitor them, not quitting until all jobs have completed.
--
--In essence, it's an SQL "watchdog" loop to start and monitor SQL Agent Jobs
--
--Code from http://cc.davelozinski.com/code/sql-watchdog-loop-start-monitor-sql-agent-jobs
--
****************************************************************/
SET NOCOUNT ON 

-------- BEGIN ITEMS THAT NEED TO BE CONFIGURED --------

--The amount of time to wait before checking again 
--to see if the jobs are still running.
--Should be in hh:mm:ss format. 
DECLARE @WaitDelay VARCHAR(8) = '00:00:20'

--Job timeout. Eg, if the jobs are running longer than this, kill them.
DECLARE @TimeoutMinutes INT = 240

DECLARE @JobsToRunTable TABLE
(
    JobName NVARCHAR(128) NOT NULL,
    JobID UNIQUEIDENTIFIER NULL,
    Running INT NULL
)

--Insert the names of the SQL jobs here. Last two values should always be NULL at this point.
--Names need to match exactly, so best to copy/paste from the SQL Server Agent job name.
INSERT INTO @JobsToRunTable (JobName, JobID, Running) VALUES ('NameOfFirstSQLAgentJobToRun',NULL,NULL)
INSERT INTO @JobsToRunTable (JobName, JobID, Running) VALUES ('NameOfSecondSQLAgentJobToRun',NULL,NULL)
INSERT INTO @JobsToRunTable (JobName, JobID, Running) VALUES ('NameOfXSQLAgentJobToRun',NULL,NULL)

-------- NOTHING FROM HERE DOWN SHOULD NEED TO BE CONFIGURED --------

DECLARE @ExecutionStatusTable TABLE
(
    JobID UNIQUEIDENTIFIER PRIMARY KEY, -- Job ID which will be a guid
    LastRunDate INT, LastRunTime INT, -- Last run date and time
    NextRunDate INT, NextRunTime INT, -- Next run date and time
    NextRunScheduleID INT, -- an internal schedule id
    RequestedToRun INT, RequestSource INT, RequestSourceID VARCHAR(128),
    Running INT,    -- 0 or 1, 1 means the job is executing
    CurrentStep INT, -- which step is running
    CurrentRetryAttempt INT, -- retry attempt
    JobState INT -- 0 = Not idle or suspended, 1 = Executing, 2 = Waiting For Thread,
                     -- 3 = Between Retries, 4 = Idle, 5 = Suspended, 
                     -- 6 = WaitingForStepToFinish, 7 = PerformingCompletionActions
)

DECLARE @JobNameToRun NVARCHAR(128) = NULL
DECLARE @IsJobRunning BIT = 1
DECLARE @AreJobsRunning BIT = 1
DECLARE @job_owner sysname = SUSER_SNAME()
DECLARE @JobID UNIQUEIDENTIFIER = null
DECLARE @StartDateTime DATETIME = GETDATE()
DECLARE @CurrentDateTime DATETIME = null
DECLARE @ExecutionStatus INT = 0
DECLARE @MaxTimeExceeded BIT = 0

--Loop through and start every job
DECLARE dbCursor CURSOR FOR SELECT JobName FROM @JobsToRunTable
OPEN dbCursor FETCH NEXT FROM dbCursor INTO @JobNameToRun
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC [msdb].[dbo].sp_start_job @JobNameToRun
    FETCH NEXT FROM dbCursor INTO @JobNameToRun
END
CLOSE dbCursor
DEALLOCATE dbCursor

print '*****************************************************************'
print 'Jobs started. ' + CAST(@StartDateTime as varchar)
print '*****************************************************************'

--Debug (if needed)
--SELECT * FROM @JobsToRunTable

WHILE 1=1 AND @AreJobsRunning = 1
BEGIN

    --This has to be first with the delay to make sure the jobs
    --have time to actually start up and are recognized as 'running'
    WAITFOR DELAY @WaitDelay 

    --Reset for each loop iteration
    SET @AreJobsRunning = 0

    --Get the currently executing jobs by our user name
    INSERT INTO @ExecutionStatusTable
    EXECUTE [master].[dbo].xp_sqlagent_enum_jobs 1, @job_owner

    --Debug (if needed)
    --SELECT 'ExecutionStatusTable', * FROM @ExecutionStatusTable

    --select every job to see if it's running
    DECLARE dbCursor CURSOR FOR 
        SELECT x.[Running], x.[JobID], sj.name 
        FROM @ExecutionStatusTable x 
        INNER JOIN [msdb].[dbo].sysjobs sj ON sj.job_id = x.JobID
        INNER JOIN @JobsToRunTable jtr on sj.name = jtr.JobName
    OPEN dbCursor FETCH NEXT FROM dbCursor INTO @IsJobRunning, @JobID, @JobNameToRun

    --Debug (if needed)
    --SELECT x.[Running], x.[JobID], sj.name 
    --  FROM @ExecutionStatusTable x 
    --  INNER JOIN msdb.dbo.sysjobs sj ON sj.job_id = x.JobID
    --  INNER JOIN @JobsToRunTable jtr on sj.name = jtr.JobName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        --bitwise operation to see if the loop should continue
        SET @AreJobsRunning = @AreJobsRunning | @IsJobRunning

        UPDATE @JobsToRunTable
        SET Running = @IsJobRunning, JobID = @JobID
        WHERE JobName = @JobNameToRun

        --Debug (if needed)
        --SELECT 'JobsToRun', * FROM @JobsToRunTable

        SET @CurrentDateTime=GETDATE()

        IF @IsJobRunning = 1
        BEGIN -- Job is running or finishing (not idle)

            IF DATEDIFF(mi, @StartDateTime, @CurrentDateTime) > @TimeoutMinutes
            BEGIN     
                print '*****************************************************************'
                print @JobNameToRun + ' exceeded timeout limit of ' + @TimeoutMinutes + ' minutes. Stopping.'
                --Stop the job
                EXEC [msdb].[dbo].sp_stop_job @job_name = @JobNameToRun
            END
            ELSE
            BEGIN
                print @JobNameToRun + ' running for ' + CONVERT(VARCHAR(25),DATEDIFF(mi, @StartDateTime, @CurrentDateTime)) + ' minute(s).'
            END
        END

        IF @IsJobRunning = 0 
        BEGIN
            --Job isn't running
            print '*****************************************************************'
            print @JobNameToRun + ' completed or did not run. ' + CAST(@CurrentDateTime as VARCHAR)
        END

        FETCH NEXT FROM dbCursor INTO @IsJobRunning, @JobID, @JobNameToRun

    END -- WHILE @@FETCH_STATUS = 0
    CLOSE dbCursor
    DEALLOCATE dbCursor

    --Clear out the table for the next loop iteration
    DELETE FROM @ExecutionStatusTable

    print '*****************************************************************'

END -- WHILE 1=1 AND @AreJobsRunning = 1

SET @CurrentDateTime = GETDATE()
print 'Finished at ' + CAST(@CurrentDateTime as varchar)
print CONVERT(VARCHAR(25),DATEDIFF(mi, @StartDateTime, @CurrentDateTime)) + ' minutes total run time.'
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top