
How ASM Failure Groups and CSS Provide High Availability

ASM Failure Groups and Redundancy
1) For systems that do not use external redundancy, ASM provides its own internal redundancy mechanism and additional high availability by way of ASM failure groups.
2) A failure group is a collection of disks and is a subset of a disk group.
3) Disk group redundancy can be:
Normal – Two-way mirroring, requiring at least two failure groups (default)
High – Three-way mirroring, requiring at least three failure groups

Disk Group Type        Supported Mirroring Levels                Default Mirroring Level
External redundancy    Unprotected (None)                        Unprotected (None)
Normal redundancy      Two-way, Three-way, Unprotected (None)    Two-way
High redundancy        Three-way                                 Three-way

4) Once a disk group is created, its redundancy cannot be changed. The only way to change the redundancy is to create a new disk group with the required redundancy and move the datafiles to it using RMAN restore or DBMS_FILE_TRANSFER.
5) ASM does not mirror disks; it mirrors extents. When ASM allocates the primary extent (the first copy) of a file on one disk in the disk group, it allocates the mirror copy of that extent on another disk in the disk group.

What Are ASM Failure Groups

Failure groups are used to store mirror copies of data. When ASM allocates an extent for a normal redundancy file, ASM allocates a primary copy and a secondary copy. ASM chooses the disk on which to store the secondary copy so that it is in a different failure group than the primary copy. Each copy is on a disk in a different failure group so that the simultaneous failure of all disks in a failure group does not result in data loss.
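
The placement rule described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Oracle's actual allocation algorithm: the disk names, failure group names, and the functions place_extent and survives are hypothetical, and the "first eligible disk" choice is a simplification.

```python
def place_extent(disks, primary_disk, copies=2):
    """Pick one disk per copy so that every copy sits in a different failure group.

    disks maps a (hypothetical) disk name to its failure group name.
    Returns the disks holding the copies, primary first.
    """
    used_groups = {disks[primary_disk]}
    placement = [primary_disk]
    for disk in sorted(disks):
        if len(placement) == copies:
            break
        # Skip any disk whose failure group already holds a copy.
        if disks[disk] not in used_groups:
            placement.append(disk)
            used_groups.add(disks[disk])
    if len(placement) < copies:
        raise ValueError("not enough failure groups for the requested copies")
    return placement


def survives(placement, disks, failed_group):
    """True if at least one copy lies outside the failed failure group."""
    return any(disks[d] != failed_group for d in placement)


# Hypothetical layout: two failure groups with two disks each.
layout = {"d1": "fg1", "d2": "fg1", "d3": "fg2", "d4": "fg2"}
extent_copies = place_extent(layout, "d1")
print(extent_copies, survives(extent_copies, layout, "fg1"))
```

Because the two copies never share a failure group, losing every disk in either group still leaves one readable copy.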

A failure group is a subset of the disks in a disk group, which could fail at the same time because they share hardware. The failure of common hardware must be tolerated.

There are always failure groups even if they are not explicitly created. If you do not specify a failure group for a disk, then Oracle automatically creates a new failure group containing just that disk. A normal redundancy disk group must contain at least two failure groups. A high redundancy disk group must contain at least three failure groups. However, Oracle recommends using several failure groups. A small number of failure groups, or failure groups of uneven capacity, can create allocation problems that prevent full use of all of the available storage.
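
As a rough illustration of the rules in this paragraph, the sketch below gives each disk without an explicit failure group a failure group of its own and then checks the minimum count for normal and high redundancy. The function names and the choice to name the implicit failure group after the disk are assumptions made for the example.

```python
# Minimum number of failure groups per redundancy level, as stated in the text.
MIN_FAILGROUPS = {"NORMAL": 2, "HIGH": 3}


def effective_failgroups(disks):
    """Map each disk to its failure group.

    disks maps a disk name to a failure group name, or None if none was given.
    A disk with no explicit failure group gets its own group, named after
    the disk in this sketch.
    """
    return {disk: (fg if fg is not None else disk) for disk, fg in disks.items()}


def has_enough_failgroups(disks, redundancy):
    """Check the minimum failure group count for NORMAL or HIGH redundancy."""
    groups = set(effective_failgroups(disks).values())
    return len(groups) >= MIN_FAILGROUPS[redundancy.upper()]


# Two disks with no explicit failure group form two implicit groups.
print(has_enough_failgroups({"d1": None, "d2": None}, "normal"))
# Two disks placed in the same failure group do not satisfy normal redundancy.
print(has_enough_failgroups({"d1": "fg1", "d2": "fg1"}, "normal"))
```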

ASM Fast Disk Resync
Disk loss in ASM can result from a number of reasons, such as loss of controller cards, cable failures, or power-supply errors. In many cases, the disk itself is still intact. To allow for sufficient time to recover from disk failures that do not involve the actual failure of a disk, ASM provides the ASM fast disk resync feature.
By default, when a disk in an ASM disk group fails, the disk is taken offline automatically and is dropped 3.6 hours later. As a result, you have only 3.6 hours by default to respond to a disk outage. If you correct the problem and the physical disk media is not corrupted, ASM fast disk resync quickly re-synchronizes the disk when it comes back online.
You can change how long Oracle waits before automatically dropping the disk by setting the disk_repair_time attribute for an individual disk group with the alter diskgroup command. This example sets disk_repair_time to 10 hours:
SQL> ALTER DISKGROUP dgroup1 SET ATTRIBUTE 'disk_repair_time' = '10h';
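
To make the repair window concrete, here is a minimal sketch that turns a disk_repair_time value into a drop deadline. It assumes a trailing h means hours, a trailing m means minutes, and a bare number means hours; the function names are hypothetical and nothing here talks to ASM.

```python
from datetime import datetime, timedelta


def parse_repair_time(value):
    """Convert a value such as '3.6h' or '90m' into a timedelta.

    Assumption: 'h' means hours, 'm' means minutes, no unit means hours.
    """
    text = value.strip().lower()
    if text.endswith("m"):
        return timedelta(minutes=float(text[:-1]))
    if text.endswith("h"):
        text = text[:-1]
    return timedelta(hours=float(text))


def drop_deadline(offline_at, repair_time="3.6h"):
    """Time at which an offline disk would be dropped if it is not repaired."""
    return offline_at + parse_repair_time(repair_time)


# A disk that went offline at midnight, with disk_repair_time set to 10h.
went_offline = datetime(2024, 1, 1, 0, 0)
print(drop_deadline(went_offline, "10h"))
```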
ASM Preferred Mirror Read

Your ASM configuration may involve remote mirroring to disks that are a fair distance away. When some of your disk mirrors are far away, those disks may not be the best set of disks for a given instance to read from. For example, you might have a Real Application Clusters (RAC) database with local and remote mirrored disks. In this case, you want each RAC instance to read primarily from its local disks to ensure the best performance. ASM preferred mirror read lets you tell Oracle which failure group an instance should prefer to read from.
Preferred mirror read is generally used with clustered ASM instances in a RAC environment, although a cluster is not strictly required. To take advantage of ASM preferred mirror read, configure each failure group with a set of disks that share a geographic location. Use the Oracle 11g parameter asm_preferred_read_failure_groups to give an instance a list of preferred failure group names to use when that instance reads from ASM disks. Each value of the asm_preferred_read_failure_groups parameter has the format diskgroup_name.failure_group_name, where diskgroup_name is the name of the disk group that the failure group belongs to and failure_group_name is the name of the preferred failure group.

If ASM cannot read from the preferred failure group, it reads from the non-preferred failure groups instead. To determine whether a given disk belongs to a preferred read failure group, check the PREFERRED_READ column of the V$ASM_DISK view.
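
The value format and the fallback behaviour can be sketched as follows. This is an illustrative model only: the helper names, the disk and failure group names, and the use of a comma to separate multiple entries are assumptions for the example, not output from an ASM instance.

```python
def preferred_read_value(pairs):
    """Build 'diskgroup.failgroup' entries, joined with commas (assumed separator)."""
    return ",".join(f"{diskgroup}.{failgroup}" for diskgroup, failgroup in pairs)


def choose_read_copy(copies, preferred, available):
    """Pick the disk to read an extent from.

    copies: list of (disk, failure_group) pairs holding the same extent.
    preferred: set of preferred failure group names for this instance.
    available: set of disks that are currently readable.
    """
    usable = [(disk, fg) for disk, fg in copies if disk in available]
    for disk, fg in usable:
        if fg in preferred:
            return disk  # a readable copy exists in a preferred failure group
    if usable:
        return usable[0][0]  # fall back to a non-preferred failure group
    raise RuntimeError("no readable copy of the extent")


# Hypothetical two-site layout: this instance prefers SITEB.
mirrors = [("d1", "SITEA"), ("d2", "SITEB")]
print(preferred_read_value([("DATA", "SITEB")]))
print(choose_read_copy(mirrors, {"SITEB"}, {"d1", "d2"}))
print(choose_read_copy(mirrors, {"SITEB"}, {"d1"}))
```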

 

Example of failure groups

Creating diskgroups for high and normal redundancy

CREATE DISKGROUP DATA NORMAL REDUNDANCY
FAILGROUP DATA_FAILURE_group_1 DISK
'/dev/5855d', '/dev/6476d'
FAILGROUP DATA_FAILURE_group_2 DISK
'/dev/5853d', '/dev/5854d';

For two-way mirroring, a disk group needs at least two failure groups so that each file extent is written to two locations.

CREATE DISKGROUP DATA HIGH REDUNDANCY
FAILGROUP DATA_FAILURE_group_1 DISK
'/dev/5851d', '/dev/5852d'
FAILGROUP DATA_FAILURE_group_2 DISK
'/dev/5853d', '/dev/5854d'
FAILGROUP DATA_FAILURE_group_3 DISK
'/dev/5855d', '/dev/5856d';

For three-way mirroring, a disk group needs at least three failure groups so that each file extent is written to three locations.
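
When statements like these are generated by a script, it is easy to leave a stray comma between the last disk of one failure group and the next FAILGROUP keyword. The sketch below builds the statement text from a mapping of failure groups to disk paths; the function name and the sample paths are hypothetical, and the result is only a string, not something executed against ASM.

```python
def create_diskgroup_sql(name, redundancy, failgroups):
    """Build a CREATE DISKGROUP statement.

    failgroups maps a failure group name to a list of disk paths.
    Commas separate disks inside a failure group only, never between groups.
    """
    clauses = []
    for fg_name, disks in failgroups.items():
        disk_list = ", ".join(f"'{path}'" for path in disks)
        clauses.append(f"FAILGROUP {fg_name} DISK {disk_list}")
    header = f"CREATE DISKGROUP {name} {redundancy.upper()} REDUNDANCY"
    return header + "\n" + "\n".join(clauses) + ";"


# Hypothetical disk paths for a two-way mirrored disk group.
statement = create_diskgroup_sql(
    "DATA",
    "normal",
    {"fg1": ["/dev/a", "/dev/b"], "fg2": ["/dev/c"]},
)
print(statement)
```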

Cluster Synchronization Services – CSS

1) CSS is required for ASM to operate.
2) CSS maintains synchronization between the ASM and database instances. CSS, which is a component of Oracle’s Cluster Ready Services (CRS), is automatically installed on every node that runs Oracle Database 10g ASM and starts up automatically on server boot-up. In RAC 10g environments, the full Oracle Clusterware (CRS) is installed on every RAC node.
3) Since CSS provides cluster management and node monitor management, it inherently monitors ASM and its shared storage components (disks and diskgroups). Upon startup, ASM registers itself and all diskgroups it has mounted with CSS. This allows CSS across all RAC nodes to keep diskgroup metadata in sync. Any new diskgroups that are created are also dynamically registered and broadcast to the other nodes in the cluster.
4) As with the database, inter-node communication is used to synchronize activities in ASM instances. CSS is used to heartbeat the health of the ASM instances. ASM inter-node messages are initiated by structural changes that require synchronization, e.g. adding a disk. Thus, ASM uses the same integrated lock management infrastructure that is used by the database for efficient synchronization.

When the ASM Oracle Home is changed in single-instance ASM, CSS must be reconfigured. The command below needs to be executed from the new home:

$ORACLE_HOME/bin/localconfig reset
