Troubleshooting Adpatch failed job

troubleshooting-tips_adpatch


adpatch failed many times and here i am explaining the tips on Troubleshooting Adpatch  failed job

Types of jobs and their resolutions

.odf files :

-odf files are there to create tables ,indexes and view etc.
-If encountered locking, identify the session that holds the lock and kill the session. Do not skip the job because of locking.Some times odf failed to drop the index.then you can manually drop the index and restart the job.

.pls scripts :

-If the failed pls scripts are for creating stored procedure or packages, If the failure is caused by locking, resolve the locking before restarting the worker.
-If pls to create objects runs for longer than 5 mins, check for locking.
-If you have more than 1 pls/sql scripts belong to the same products are running and they run for a while, dead-lock may happen. In that case, fail all the workers and then start 1 worker at a time until they have passed the dead-lock point.

.sql Scripts:

-If the script runs for longer than 5mins, review the script. If the script is to create or replace stored procedure, the script should have completed in a few minutes otherwise check for locking.
-If the script is data update, check the sql that makes up the main cursor or sql that selects the rows to be updated. Monitor the progress by running the sql that selects the rows to be updated. The number of rows should decrease as the script runs. . If encountered locking, identify the session that holds the lock and kill the session.

WFLOAD and FNDLOAD failure– Look for the error message in the log file of the worker that fails.
-For WFLOAD and FNDLOAD, you need to look at the log file generated by the above command. The worker’s logfile should point you to the location of the log file.
-If the error is caused by database locking, find out the sid and serial number of the process that is holding the lock.
Once you have identified the sid and serial#, try killing the database session by issuing the following command in sqlplus as system : alter system kill session ‘sid,serial#’;

If any of the workers failed due to insufficient tablespace, then add the space , restart the failed worker.
If you have more than 1 ldt scripts belong to the same products are running and they run for a while, dead-lock may happen. In that case, fail all the workers and then start 1 worker at a time until they have passed the dead-lock point.

Xml jobs failure-These jobs create tables,view,materialized view etc.
-If encountered locking, identify the session that holds the lock and kill the session. Do not skip the job because of locking.
-If you have more than 1 ldt scripts belong to the same products are running and they run for a while, dead-lock may happen. In that case, fail all the workers and then start 1 worker at a time until they have passed the dead-lock point.
– It is it due to java memory issue.try starting the workers one by one,so that less memory is used and all get completed.

Generating reports and forms :

If adpatch fails to generate reports or forms. it is ok to skip these failure as you can always regenerate them after patching.

Killing Long Runnig SQL

 

  • Find the session id and serial # of the session at which the SQL is running.
  • Using the following statement kill the SQL session

ALTER SYSTEM KILL SESSION ‘sid,serial#’;

  • The worker starts the SQL again, use the same steps above and kill again.
  • After TWO failures, adctrl will show the status of SQL as FAILED.
  • Using the option 8 and the worker number, change the status to ‘Skip & Restart’. Patch will skip this SQL and continue.


Some scripts to help us in identifying the locks

The following sql will give information on what processes that are currently accessing the object that the patch tries to update.

Select va.sid,vp.serial#
From v$access va, v$process vp
Where va.object_name=’&locked_object’
And va.sid = vp.sid
;

This script will identify what session is waiting for at that moment.If the result is enqueue,the problem is lock and identify the holder and kill it

select sid,seq#,wait_time,event,seconds_in_wait,state from v$session_wait where sid in (&sid)

Sql to kill the session

alter system kill session ‘sid,serial#’;

To see all the waiters in the database

set linesize 1000
column waiting_session heading ‘WAITING|SESSION’
column holding_session heading ‘HOLDING|SESSION’
column lock_type format a15
column mode_held format a15
column mode_requested format a15

select
waiting_session,
holding_session,
lock_type,
mode_held,
mode_requested,
lock_id1,
lock_id2
from
dba_waiters
/

To find library cache locks

select /*+ all_rows */ w1.sid waiting_session,
h1.sid holding_session,
w.kgllktype lock_or_pin,
w.kgllkhdl address,
decode(h.kgllkmod, 0, ‘None’, 1, ‘Null’, 2, ‘Share’, 3, ‘Exclusive’,
‘Unknown’) mode_held,
decode(w.kgllkreq, 0, ‘None’, 1, ‘Null’, 2, ‘Share’, 3, ‘Exclusive’,
‘Unknown’) mode_requested
from dba_kgllock w, dba_kgllock h, v$session w1, v$session h1
where
(((h.kgllkmod != 0) and (h.kgllkmod != 1)
and ((h.kgllkreq = 0) or (h.kgllkreq = 1)))
and
(((w.kgllkmod = 0) or (w.kgllkmod= 1))
and ((w.kgllkreq != 0) and (w.kgllkreq != 1))))
and w.kgllktype = h.kgllktype
and w.kgllkhdl = h.kgllkhdl
and w.kgllkuse = w1.saddr
and h.kgllkuse = h1.saddr
/

set linesize 132

column “USER” format a20
column “SID_SERIAL” format a15
column “EVENT” format a60 wrap

select
b.username || ‘(‘|| b.osuser || ‘)’ “USER”,
a.sid || ‘,’|| b.serial# “SID_SERIAL”,
a.event || ‘=>’|| a.p1text || ‘=’ || a.p1raw || ‘ ‘ ||
a.p2text || ‘=’ || a.p2 || ‘ ‘ ||
a.p3text || ‘=’ || a.p3 “EVENT”
from v$session_wait a, v$session b
where a.sid = b.sid
and a.event like ‘library%’
order by 3
;

Some imp point to note
1. Never skip the failed worker without consulting from oracle
2. Key to resolution is checking the logs file and finding out the correct errors
3. When running adpatch.it is good to monitor the database for locks and tablespace free space.
4. Workers should be choosen appropiatly…it should not chooseen very high otherwise it will slow down the adpatch process


Leave a Reply