Voting Disks in Oracle RAC
A voting disk is a file that manages information about node membership. It is located on a shared cluster file system, on a shared raw device, or in Oracle ASM (11g R2 onwards). Its primary purpose is to help in situations where private network communication fails. CSS (Cluster Synchronization Services) is the service that determines which nodes in the cluster are available. It communicates through a dedicated private network and uses the voting disk as a secondary communication mechanism, sending heartbeat messages through both the network and the voting disk.
When a private network failure prevents nodes from synchronizing I/O to the shared disks, some of the nodes must go offline. At this point the voting disk is used to communicate node status and to track which nodes are offline. Without the voting disk, it is difficult to tell whether nodes are facing a network problem or are no longer available.
Without a voting disk, a network failure leaves the nodes unable to communicate with each other or to synchronize their access to the database. This situation is called the cluster split-brain problem. When a node is not able to send its heartbeat to the voting disk, it reboots itself.
For high availability, Oracle recommends that you have a minimum of three voting disks. If you configure a single voting disk, you should use external mirroring to provide redundancy.
Important points about Voting disks in Oracle RAC
These files can be stored either in ASM or on shared storage.
1) If they are stored in ASM, no manual configuration is needed; the files are created according to the redundancy of the ASM disk group.
2) On a shared storage system, we need to configure these files manually, with redundancy set up for high availability.
a) We must have an odd number of voting disks.
b) Oracle recommends a minimum of 3 and a maximum of 5. Oracle Clusterware supports up to 32 voting disks in 10g, but only up to 15 in 11g R2.
c) A node must be able to access more than half of the voting disks at any time. For example, if you have 5 voting disks, a node must be able to access at least 3 of them. If it cannot access this minimum number of voting disks, it is evicted (removed) from the cluster.
d) All nodes in the RAC cluster register their heartbeat information in the voting disks/files. The RAC heartbeat is the polling mechanism that is sent over the cluster interconnect to ensure all RAC nodes are available.
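The majority rule in point (c) comes down to simple integer arithmetic. Below is a minimal shell sketch of that arithmetic. It is not part of any Oracle tooling, and the function names required_votes and tolerated_failures are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the majority rule; these are not Oracle commands.

# Minimum number of voting disks a node must be able to access:
# strictly more than half of the configured disks.
required_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Number of voting disks that can be lost before a node drops below the majority.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 4 5; do
    echo "$n voting disk(s): need $(required_votes "$n"), can lose $(tolerated_failures "$n")"
done
```

In this sketch, 4 voting disks tolerate the loss of only 1 disk, the same as 3 voting disks, which is one way to see why an odd number of disks is required.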
What are NETWORK and DISK HEARTBEAT and how are they registered in VOTING DISKS/FILES
1) All nodes in the RAC cluster register their heartbeat information in the voting disks/files. The RAC heartbeat is the polling mechanism that is sent over the cluster interconnect to ensure all RAC nodes are available. Voting disks/files are just like an attendance register in which the nodes mark their attendance (heartbeats).
2) The CSSD process on every node makes entries in the voting disk to ascertain the membership of the node. While marking their own presence, all the nodes also register in the voting disk the information about their communicability with other nodes. This is called NETWORK HEARTBEAT.
3) The CSSD process on each RAC node maintains its heartbeat in a block of size 1 OS block, at a specific offset in the hot block of the voting disk. The written block has a header area with the node name. The heartbeat counter increments every second on every write call. Thus the heartbeats of the various nodes are recorded at different offsets in the voting disk. This process is called DISK HEARTBEAT. In addition to maintaining its own disk block, each CSSD process also monitors the disk blocks maintained by the CSSD processes of the other nodes in the cluster. Healthy nodes have continuous network and disk heartbeats exchanged between the nodes. A break in the heartbeats indicates a possible error scenario.
4) If the disk block is not updated within a short timeout period, the node is considered unhealthy and may be rebooted to protect the database. In this case, a message to this effect is written in the KILL BLOCK of the node. Each node reads its KILL BLOCK once per second; if the kill block has been overwritten, the node commits suicide.
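The disk heartbeat described above can be modelled with an ordinary file. The sketch below is a toy illustration only: the 512-byte block size, the "node name plus counter" layout, and the function names write_heartbeat and read_counter are all assumptions made for this example, since the real on-disk format is internal to Oracle Clusterware.

```shell
#!/bin/sh
# Toy model of the disk heartbeat: an ordinary file stands in for the voting disk.
# BLOCK_SIZE and the block layout are assumptions for illustration only.
BLOCK_SIZE=512

# write_heartbeat FILE SLOT NODE_NAME COUNTER
# Writes one padded block at offset SLOT * BLOCK_SIZE without truncating the file.
write_heartbeat() {
    printf '%s %s\n' "$3" "$4" |
        dd of="$1" bs="$BLOCK_SIZE" seek="$2" count=1 conv=notrunc,sync 2>/dev/null
}

# read_counter FILE SLOT
# Prints the counter stored in the given slot (nothing if the slot is empty).
read_counter() {
    dd if="$1" bs="$BLOCK_SIZE" skip="$2" count=1 2>/dev/null |
        tr -d '\000' | awk '{ print $2 }'
}

disk=$(mktemp)
trap 'rm -f "$disk"' EXIT

write_heartbeat "$disk" 0 rac1 1
write_heartbeat "$disk" 1 rac2 1
before=$(read_counter "$disk" 1)

# rac1 keeps beating while rac2 goes silent.
write_heartbeat "$disk" 0 rac1 2
after=$(read_counter "$disk" 1)

if [ "$before" = "$after" ]; then
    echo "rac2 heartbeat did not advance (counter still $after)"
fi
```

Each node writes only to its own slot and reads the slots of the others, so a counter that stops advancing is what an observer sees when a node stops updating its block.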
How Voting disks in RAC help
The CSSD (Cluster Synchronization Services daemon) processes monitor the health of RAC nodes using two distinct heartbeats: the network heartbeat and the disk heartbeat. Healthy nodes have continuous network and disk heartbeats exchanged between the nodes. A break in the heartbeat indicates a possible error scenario. There are a few different scenarios possible with missing heartbeats:
1. Network heartbeat is successful, but disk heartbeat is missed.
2. Disk heartbeat is successful, but network heartbeat is missed.
3. Both heartbeats failed.
In addition, with numerous nodes, there are other possible scenarios too. A few possible scenarios:
1. Nodes have split into N sets of nodes, communicating within the set, but not with members of the other sets.
2. Just one node is unhealthy.
Nodes with quorum will maintain active membership of the cluster, and the other node(s) will be fenced/rebooted.
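The idea that the nodes with quorum stay in the cluster while the rest are fenced can be sketched as a small selection routine. This is a hypothetical illustration, not Oracle's implementation: it assumes the largest set of nodes survives and, as an assumed tie-break, that the set containing the lowest node number wins when sizes are equal. The function name pick_survivor is invented for this example.

```shell
#!/bin/sh
# Hypothetical illustration only: "largest set wins, lowest node number breaks ties"
# is an assumption made for this sketch.

# pick_survivor GROUP...
# Each GROUP is a comma-separated list of node numbers that can still reach each other.
pick_survivor() {
    best=""
    best_size=0
    best_min=0
    for group in "$@"; do
        # Number of nodes in this group, and its lowest node number.
        size=$(( $(printf '%s\n' "$group" | tr ',' '\n' | wc -l) ))
        min=$(printf '%s\n' "$group" | tr ',' '\n' | sort -n | head -n 1)
        if [ "$size" -gt "$best_size" ] ||
           { [ "$size" -eq "$best_size" ] && [ "$min" -lt "$best_min" ]; }; then
            best=$group
            best_size=$size
            best_min=$min
        fi
    done
    echo "$best"
}

pick_survivor "3" "1,2"      # prints 1,2 (larger set)
pick_survivor "3,4" "1,2"    # prints 1,2 (equal size, lowest node number)
```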
CSSD is a multithreaded process, so it uses various threads to monitor the heartbeats.
What is stored in voting disks in RAC?
Voting disks contain static and dynamic data.
Static data : Info about nodes in the cluster
Dynamic data : Disk heartbeat logging
They hold important details about cluster node membership, such as
– which node is part of the cluster,
– who (node) is joining the cluster, and
– who (node) is leaving the cluster.
Important Operations on Voting Disks in RAC
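a) To list the currently configured voting disks
$ORA_CRS_HOME/bin/crsctl query css votedisk
b) Adding or deleting voting disks
Run the following command as the root user to add a voting disk, where path_to_voting_disk is the path of the new voting disk:
crsctl add css votedisk path_to_voting_disk
Run the following command as the root user to remove a voting disk:
crsctl delete css votedisk path_to_voting_disk
c) Replacing a voting disk
Run the following command, where the argument is either an ASM disk group (+asm_disk_group) or the path of the new voting disk:
crsctl replace votedisk path_to_voting_disk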
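Backing Up Voting Disks in RAC
Run the following command to back up a voting disk. Perform this operation on every voting disk as needed, where voting_disk_name is the name of the active voting disk and backup_file_name is the name of the file to which you want to back up the voting disk contents. This dd procedure applies to releases before 11g R2. From 11g R2 onward, the voting disk data is backed up automatically in the OCR, and Oracle does not support backing up or restoring voting disks with dd.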
dd if=voting_disk_name of=backup_file_name
$ dd if=[votedisk1] of=/home/oracle/vote/vote.dmp bs=4k
1675289+1 records in
1675289+1 records out
(vote.dmp is the name of the backup file of the voting disk.)
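A dd copy is only useful if it matches the source, so it can be worth comparing the two afterwards. The sketch below wraps the copy and a byte-for-byte comparison in one function. The name backup_and_verify is hypothetical, and the demonstration runs on scratch files rather than on a real voting disk.

```shell
#!/bin/sh
# backup_and_verify SOURCE BACKUP
# Copies SOURCE to BACKUP with dd, then compares the two byte for byte.
# Hypothetical helper for illustration; it is not an Oracle utility.
backup_and_verify() {
    dd if="$1" of="$2" bs=4k 2>/dev/null || return 1
    cmp -s "$1" "$2"
}

# Demonstration on scratch files standing in for a voting disk and its backup.
src=$(mktemp)
dst=$(mktemp)
trap 'rm -f "$src" "$dst"' EXIT

# 10000 bytes: two full 4k blocks plus one partial block.
dd if=/dev/urandom of="$src" bs=1000 count=10 2>/dev/null

if backup_and_verify "$src" "$dst"; then
    echo "backup verified"
else
    echo "backup differs from source" >&2
fi
```

When the source size is not a multiple of the block size, dd reports the final partial block after the plus sign, which is what the "+1" in the record counts indicates.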
Recovering Voting Disks in RAC
Run the following command to recover a voting disk where backup_file_name is the name of the voting disk backup file and voting_disk_name is the name of the active voting disk:
dd if=backup_file_name of=voting_disk_name
[oracle@rac1 bin]$ dd if=/home/oracle/vote/vote.dmp of=[votedisk1] bs=4k
1675289+1 records in
1675289+1 records out
I hope you found this information on voting disks in RAC useful. Please do provide feedback.