Many times Oracle database get hangs due to locks or latching issue and Nobody can login to the system. It becomes critical and And Usually we do shutdown abort to immediately resolve the issue. But it is good to take system state dump/hanganalyze at 1 min interval to analyze the locking or latching issue afterwards before doing shutdown abort so that we can find the fix for the problem and avoid that in future.Oracle Support also usually ask for these when the database hangs for some reason
system state dump
Here are the steps to take system state dump
In RAC system this need to be executed on all the instance
Syntax for system state dump:
Logon to sqlplus as sysdba SQL> oradebug setmypid SQL> oradebug unlimit SQL> oradebug dump systemstate 266 …….. Wait at least 1 min SQL> oradebug dump systemstate 266 …….. Wait at lease 1 min SQL> oradebug dump systemstate 266 SQL> oradebug tracefile_name
With releases 11g & RAC and above You should attach to the DIAG process and change its file size limit.
The following commands will create the system state dump on each node in the diag trace file.No need to execute on all instances
sqlplus "/ as sysdba" select SPID from v$process where program like '%DIAG%'; oradebug setospid <OS_PID> -- pid of the diag process oradebug unlimit oradebug dump systemstate 266 oradebug tracefile_name exit
Hanganalyze
Similarly Hanganalyze is a good utility to analysis.
Here are the steps
Syntax for single instance
Logon to sqlplus as sysdba oradebug setmypid; oradebug unlimit; oradebug hanganalyze 3; oradebug tracefile_name
Syntax for RAC
Logon to sqlplus as sysdba oradebug setmypid; oradebug unlimit; oradebug setinst all oradebug -g def hanganalyze 3 oradebug tracefile_name
Sometimes it is not possible to even login with sqlplus “/ as sysdba ” at the time of hang, In that situation we can use below command to enter with 11g onwards
sqlplus -prelim / as sysdba
V$wait_chains
From 11gR2 onwards,oracle has provided a dynamic performance view called v$wait_chains. This also contain same information which we gather by running the hanganalyze command. So instead of using hanganalyze, you can use the below query to find wait chains
This is done using dia0 background processes starts collecting hanganalyze information and stores this in memory in the “hang analysis cache”. It does this every 3 seconds for local hanganalyze information and every 10 seconds for global (RAC) hanganalyze information
There is no gv$ equivalent as v$wait_chains would report on multiple instances in a multi-instance (RAC) environment
Some queries for this view
SQL> SELECT chain_id, num_waiters, in_wait_secs, osid, blocker_osid, substr(wait_event_text,1,30) FROM v$wait_chains;
SELECT decode( a.blocker_sid , NULL , '<chain id#' ||a.chain_id||'>' ) chain_id, RPAD( '+' , LEVEL , '-' ) ||a.sid sid, RPAD( ' ' , LEVEL , ' ' ) ||a.wait_event_text wait_event FROM V$WAIT_CHAINS a CONNECT BY PRIOR a.sid=a.blocker_sid AND PRIOR a.sess_serial#=a.blocker_sess_serial# AND PRIOR a.instance = a.blocker_instance START WITH a.blocker_is_valid='FALSE' ORDER BY a.chain_id , LEVEL /
Query for Top 100 wait chain processes
set pages 1000 set lines 120 set heading off column w_proc format a50 tru column instance format a20 tru column inst format a28 tru column wait_event format a50 tru column p1 format a16 tru column p2 format a16 tru column p3 format a15 tru column Seconds format a50 tru column sincelw format a50 tru column blocker_proc format a50 tru column fblocker_proc format a50 tru column waiters format a50 tru column chain_signature format a100 wra column blocker_chain format a100 wra SELECT * FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE, 'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)|| ' from Instance '||blocker_instance BLOCKER_PROC, 'Number of waiters: '||num_waiters waiters, 'Final Blocking Process: '||decode(p.spid,null,'', p.spid)||' from Instance '||s.final_blocking_instance FBLOCKER_PROC, 'Program: '||p.program image, 'Wait Event: ' ||wait_event_text wait_event, 'P1: '||wc.p1 p1, 'P2: '||wc.p2 p2, 'P3: '||wc.p3 p3, 'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw, 'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null, '',blocker_chain_id) blocker_chain FROM v$wait_chains wc, gv$session s, gv$session bs, gv$instance i, gv$process p WHERE wc.instance = i.instance_number (+) AND (wc.instance = s.inst_id (+) and wc.sid = s.sid (+) and wc.sess_serial# = s.serial# (+)) AND (s.final_blocking_instance = bs.inst_id (+) and s.final_blocking_session = bs.sid (+)) AND (bs.inst_id = p.inst_id (+) and bs.paddr = p.addr (+)) AND ( num_waiters > 0 OR ( blocker_osid IS NOT NULL AND in_wait_secs > 10 ) ) ORDER BY chain_id, num_waiters DESC) WHERE ROWNUM < 101;
Also Reads
Oracle DBA interview questions
oracle locks
Oracle Latches
How to check alert Log in Oracle
v$system_parameter
10046 trace
https://support.oracle.com/knowledge/Oracle%20Database%20Products/423153_1.html