Last updated on February 23rd, 2019 at 07:53 am
Introduction To Latch in Oracle Database
Latches are low level serialization mechanisms used to protect shared data structures in the Oracle SGA. The implementation of latches is operating system dependent, particularly in regard to whether a process will wait for a latch and for how long.
A latch is a type of a lock that can be very quickly acquired and freed. Latches are typically used to prevent more than one process from executing the same piece of code at a given time.
Associated with each latch is a cleanup procedure that will be called if a process dies while holding the latch. Latches have an associated level that is used to prevent deadlocks. Once a process acquires a latch at a certain level it cannot subsequently acquire a latch at a level that is equal to or less than that level (unless it acquires it no-wait).
Oracle latch is actually a spin mutexes on process level which used in most cases to guard the SGA fixed and heap based data structures from being changed every time by more than just a single process
A process acquires a latch when working with a structure in the SGA (System Global Area). It continues to hold the latch for the period of time it works with the structure. The latch is dropped when the process is finished with the structure. Each latch protects a different set of data, identified by the name of the latch.
Oracle uses atomic instructions like “test and set” for operating on latches. Processes waiting to execute a part of code for which a latch has already been obtained by some other process will wait until the latch is released. Examples are redo allocation latches, copy latches, archive control latch etc.
The basic idea is to block concurrent access to shared data structures. Since the instructions to set and free latches are atomic, the OS guarantees that only one process gets it. Since it is only one instruction, it is quite fast. Latches are held for short periods of time and provide a mechanism for cleanup in case a holder dies abnormally while holding it. This cleaning is done using the services of PMON.
Latches request can be made in two modes: “willing-to-wait” or “no wait”.
Normally, latches will be requested in “willing-to-wait” mode. A request in “willing-to-wait” mode will loop, wait, and request again until the latch is obtained.
In “no wait” mode the process request the latch. If one is not available, instead of waiting, another one is requested. Only when all fail does the server process have to wait.
Examples of “willing-to-wait” latches are: shared pool and library cache latches
A example of “no wait” latches is the redo copy latch.
Difference between Latches and Enqueues
Latch present a contrast with enqueues which always protect structures which has some disk based equivalent such as database objects, data blocks or table rows.
Another significant difference between latch and enqueue In latches there is no ordered queue of waiters like in enqueues. Latch waiters may either use timers to wake up and retry or spin (only in multiprocessors). Since all waiters are concurrently retrying (depending on the scheduler), anyone might get the latch and conceivably the first one to try might be the last one to get.
If a required latch is busy, the process requesting it spins, tries again and if still not available, spins again. The loop is repeated up to a maximum number of times determined by the initialization parameter _SPIN_COUNT.
If after this entire loop, the latch is still not available, the process must yield the CPU and go to sleep. Initially is sleeps for one centisecond. This time is doubled in every subsequent sleep. This causes a slowdown to occur and results in additional CPU usage, until a latch is available. The CPU usage is a consequence of the “spinning” of the process. “Spinning” means that the process continues to look for the availability of the latch after certain intervals of time,
during which it sleeps.
More information on latches
Each row in the V$LATCH table contains statistics for a different type of latch. The columns of the table reflect activity for different types of latch requests. The distinction between these types of requests is whether the requesting process continues to request a latch if it is unavailable:
V$LATCHNAME key information:
GETS Number of successful willing-to-wait requests for
MISSES Number of times an initial willing-to-wait request
SLEEPS Number of times a process waited a requested a latch
after an initial wiling-to-wait request.
IMMEDIATE_GETS Number of successful immediate requests for each latch.
IMMEDIATE_MISSES Number of unsuccessful immediate requests for each latch.
Calculating latch hit ratio
To get the Hit ratio for latches apply the following formula:
“willing-to-wait” Hit Ratio=(GETS-MISSES)/GETS
“no wait” Hit Ratio=(IMMEDIATE_GETS-IMMEDIATE_MISSES)/IMMEDIATE_GETS
This number should be close to 1. If not, tune according to the latch name
Useful SQL scripts to get latch information
-Display System-wide latch statistics.
column name format A32 truncate heading "LATCH NAME" column pid heading "HOLDER PID" select c.name,a.addr,a.gets,a.misses,a.sleeps, a.immediate_gets,a.immediate_misses,b.pid from v$latch a, v$latchholder b, v$latchname c where a.addr = b.laddr(+) and a.latch# = c.latch# order by a.latch#;
-Given a latch address, find out the latch name.
column name format a64 heading 'Name' select name from v$latchname a, v$latch b where b.addr = '&addr' and b.latch#=a.latch#;
-Display latch statistics by latch name.
column name format a32 heading 'LATCH NAME' column pid heading 'HOLDER PID' select c.name,a.addr,a.gets,a.misses,a.sleeps, a.immediate_gets,a.immediate_misses,b.pid from v$latch a, v$latchholder b, v$latchname c where a.addr = b.laddr(+) and a.latch# = c.latch# and c.name like '&latch_name%' order by a.latch#;
List of all the latches
Oracle versions might differ in the latch# assigned to the existing latches.
The following query will help you to identify all latches and the number assigned.
column name format a40 heading 'LATCH NAME' select latch#, name from v$latchname;
SPIN_COUNT controls how many times the process will re-try to obtain the latch before backing off and going to sleep. This basically means the process is in a tight CPU loop continually trying to get the latch for SPIN_COUNT attempts. On a single CPU system if an Oracle process tries to acquire a latch but it is held by someone else the process will release the CPU and go to sleep for a short period before trying again. However, on a multi processor system (SMP) it is possible that the
process holding the latch is running on one of the other CPUs and so will potentially release the latch in the next few instructions (latches are usually held for only very short periods of time). Performance can be adjusted by changing the value of SPIN_COUNT. If a high value is used, the latch will be attained sooner than if
you use a low value. However, you may use more CPU time spinning to get the latch if you use a high value for SPIN_COUNT.
INTRODUCTION TO MUTEX in Oracle
A mutex is smaller than a latch .The code for a latch is around 110 bytes in size (which contains anywhere between 150 and 200 instructions) while the code for a mutex is a mere 28 bytes in size , making a mutex faster to execute.
As it is shorter,Mutex does not contain data on who is waiting or about the length of the wait, it won’t have information on requests and misses but there is sleep data available (how many, how long) and a couple of other details that can help monitor them.
It provides below benefits
Less potential for false contention.
there are only 67 library cache latches available to protect 131,072 buckets created by default in library cache. So single latch covers multiple buckets. This creates a false contention, meaning if 2 process are trying to access 2 different buckets protected by same latch, one of them have to wait until other completes traversing its bucket. So even though they are not colliding on same bucket they still will be blocked on each other because latch mechanism
As against 67 latches to protect 131,072 buckets, Oracle introduced 131,072 mutexes to protect each bucket. So false contention is reduced drastically. False contention can still occur if 2 process want to access to 2 different objects which belongs to same bucket but its very rare.
Replace both latches and pins.
A mutex can act as a serialization mechanism (just like a latch, restricting access to a single session) and also as a pin (to prevent an object from ‘aging out’ of the relevant cache). A mutex has both an exclusive and a shared mode, and in shared mode multiple sessions can access (using the term ‘reference’) the same mutex and the number of sessions currently referencing a given mutex is known as the ‘reference count’. That information is stored in the mutex. And, like a latch, a mutex can also be held in exclusive mode and the reference count of 1 could indicate an exclusive hold on that mutex. Additionally, an object cannot be aged out of a cache until the mutex reference count is 0.
Mutex structure is located in each child cursor.
Oracle provides below two views to monitor mutex
SQL> desc v$mutex_sleep Name Null? Type ----------------------------------------- -------- ---------------------------- MUTEX_TYPE VARCHAR2(32) LOCATION VARCHAR2(40) SLEEPS NUMBER WAIT_TIME NUMBER CON_ID NUMBER SQL> desc v$mutex_sleep_history Name Null? Type ----------------------------------------- -------- ---------------------------- MUTEX_IDENTIFIER NUMBER SLEEP_TIMESTAMP TIMESTAMP(6) MUTEX_TYPE VARCHAR2(32) GETS NUMBER SLEEPS NUMBER REQUESTING_SESSION NUMBER BLOCKING_SESSION NUMBER LOCATION VARCHAR2(40) MUTEX_VALUE RAW(8) P1 NUMBER P1RAW RAW(8) P2 NUMBER P3 NUMBER P4 NUMBER P5 VARCHAR2(64) CON_ID NUMBER
As of release 11.2.0.x the following latches have been replaced by mutexes:
library cache pin allocation
library cache lock allocation
library cache hash chains
library cache lock
library cache pin
The library cache waits listed below no longer exist in releases 11.2.0.x and later:
latch: library cache
latch: library cache lock
latch: library cache pin