Sunday, April 29, 2007

Killing Long Running Jobs

Hi all,

I have often seen DBA's bouncing their database to get rid of stubborn jobs. In this post, I describe a way to remove the long running jobs without bouncing your database.

First, we need to identify the long running jobs, once we know what jobs are stubborn, we will mark these jobs as 'BROKEN'. We then kill the sessions linked with the jobs, and optionally kill the OS processes associated with these sessions. Next, we mark the jobs back as 'UNBROKEN' and finally run the job.

Note: Before marking the jobs as Broken or killing the session, run the following SQL’s and store the output of each query.

1. Find all the Long Running Jobs

Use the following query to get a detailed information on long running jobs

select djr.Sid,
dj.Log_User,
dj.Job,
dj.Broken,
dj.Failures,
dj.Last_Date,
dj.Last_Sec,
dj.This_Date,
dj.This_Sec,
dj.Next_Date,
dj.Next_Sec,
dj.Interval,
dj.What
from Dba_Jobs dj,
Dba_Jobs_Running djr
where dj.job = djr.job;

2. Mark the Jobs as "BROKEN'

Using the below query, I create a script to mark all the long running jobs as broken.

select ‘Exec DBMS_JOB.BROKEN(‘j.job ‘,TRUE);’
from dba_jobs dj,
dba_jobs_running djr
where dj.job = djr.job;

3. Kill Oracle Sessions

Using the below SQL query, you can kill all the sessions associated with the long running jobs.

select 'ALTER SYSTEM KILL SESSION ''' s.sid','s.serial#''';'
from dba_jobs dj, dba_jobs_running djr, v$process p , v$session s
where dj.job = djr.job
and p.addr = s.paddr
and djr.sid = s.sid;

4. Kill the OS Sessions

Optionally, you may kill the OS sessions to speedup PMON in cleaning the killed session undo data.

select 'Kill -9 ' p.spid
from dba_jobs dj, dba_jobs_running djr, v$process p , v$session s
where dj.job = djr.job
and p.addr = s.paddr
and djr.sid = s.sid;

5. Mark the jobs as Unbroken

Wait till all the sessions marked as ‘KILLED’ have been really killed and disappear from the V$SESSION view. Next, mark all the jobs as unbroken using the below query.

select ‘Exec DBMS_JOB.BROKEN(‘||j.job|| ‘,FALSE);’
           from dba_jobs dj, 
                  dba_jobs_running djr
         where dj.job = djr.job; 

6. Manually Run the jobs

Now that all the long running jobs have disappeared and its time to put the jobs back in the job queue.

select ‘Exec DBMS_JOB.RUN(‘||j.job|| ‘);’
           from dba_jobs dj, 
                  dba_jobs_running djr
         where dj.job = djr.job; 

7. Query ALL_JOBS views

Query the ALL_JOBS view to see that all the jobs are back in motion.

This way you will put the jobs back in trace without the need to recycle the database.

Happy reading !!!

4 comments:

  1. Step 5 has the wrong sql ( copy of step 4 )

    ReplyDelete
  2. @Nick,

    Thanks. I have made corrections.

    ReplyDelete
  3. 3. Kill Oracle Sessions

    You can use the IMMEDIATE option which kill session at oracle level.

    ReplyDelete
  4. Excellent post..........
    i have also posted similar post :

    http://chandu208.blogspot.com/2011/08/find-kill-session.html

    ReplyDelete