What is Oracle RAC and what is it used for?
Oracle RAC is a cluster database in which multiple Oracle instances run on multiple nodes and share a single physical database, with common data files and control files.
Each instance has its own redo log files and rollback segments (undo tablespace) and can simultaneously execute transactions against the single database.
Caches are synchronized using Oracle Cache Fusion technology.
Because Oracle RAC has multiple instances, it provides high availability and scalability for business applications.
What are the prerequisites or conditions for an Oracle RAC setup?
Shared storage, a private interconnect, Oracle Clusterware, virtual IPs, SCAN IPs, and multiple servers
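What are the special background processes for RAC?
LCKn: Lock process; handles non-Cache Fusion resource requests such as library cache and row cache locks
LMD: Global Enqueue Service daemon; handles global enqueue and instance lock requests
LMSn: Global Cache Service process; performs the Cache Fusion block transfers between instances
LMON: Global Enqueue Service monitor; issues heartbeats and performs recovery of global resources
LMHB: Monitors the heartbeat of LMON, LMD, and LMSn
ACMS: Atomic Controlfile to Memory Service
GTX0-j: Global transaction process
PING: Interconnect latency measurement process
RMSn: Oracle RAC management process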
What is cache fusion?
Cache Fusion is the transfer of data blocks between RAC instances over the private network (interconnect). In effect, it is remote memory mapping of Oracle buffers, shared between the caches of the participating nodes in the cluster. When an instance requests a data block, it first checks whether any instance already holds the block in its cache. If another instance holds it, the requesting instance receives the block from that instance through Cache Fusion. If no instance holds the block, it is read from the datafiles on the shared storage.
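The lookup order described above can be sketched as a toy model. This is only an illustration of the sequence (local cache, then a remote cache over the interconnect, then disk); the function name and the dictionary-based caches are assumptions made for the example and have nothing to do with how Oracle implements Cache Fusion internally.

```python
from typing import Dict, Tuple

def read_block(block_id: int,
               local: str,
               caches: Dict[str, Dict[int, bytes]],
               disk: Dict[int, bytes]) -> Tuple[bytes, str]:
    """Return (block, source); source is 'local', 'remote:<instance>' or 'disk'."""
    # 1. The requesting instance already holds the block.
    if block_id in caches[local]:
        return caches[local][block_id], "local"
    # 2. Another instance holds the block: copy it across (the interconnect transfer).
    for inst, cache in caches.items():
        if inst != local and block_id in cache:
            caches[local][block_id] = cache[block_id]
            return cache[block_id], f"remote:{inst}"
    # 3. No instance holds the block: read it from the shared datafiles.
    caches[local][block_id] = disk[block_id]
    return disk[block_id], "disk"

# Hypothetical two-instance cluster; inst2 already caches block 7.
caches = {"inst1": {}, "inst2": {7: b"row data"}}
disk = {7: b"row data", 8: b"other data"}
print(read_block(7, "inst1", caches, disk)[1])  # remote:inst2
print(read_block(7, "inst1", caches, disk)[1])  # local
print(read_block(8, "inst1", caches, disk)[1])  # disk
```

The second call returns "local" because the first call left a copy of the block in the requesting instance's cache.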
What is the purpose of Private Interconnect?
Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol. RAC uses the interconnect for cache fusion (UDP/RDS) and inter-process communication (TCP).
What are RAC specific Parameters?
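Which initialization parameters must have the same value for every instance in an Oracle RAC database, and which must be unique?
Same across the instances: CLUSTER_DATABASE, COMPATIBLE, CONTROL_FILES, DB_BLOCK_SIZE, DB_DOMAIN, DB_FILES, DB_NAME, DB_RECOVERY_FILE_DEST, DB_RECOVERY_FILE_DEST_SIZE, REMOTE_LOGIN_PASSWORDFILE, UNDO_MANAGEMENT
Unique across the instances: INSTANCE_NAME, INSTANCE_NUMBER, THREAD, UNDO_TABLESPACE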
What are the wait events in RAC?
gc buffer busy
gc buffer busy acquire
gc current request
gc cr request
gc cr failure
gc current block lost
gc cr block lost
gc current block corrupt
gc cr block corrupt
gc current block busy
gc cr block busy
gc current block congested
gc cr block congested
gc current block 2-way
gc cr block 2-way
gc current block 3-way
gc cr block 3-way
(No matter how many nodes the cluster has, a block transfer involves at most three instances, so 3-way is the highest such wait event.)
gc current grant 2-way
gc cr grant 2-way
gc current grant busy
gc current grant congested
gc cr grant congested
gc cr multi block read
gc current multi block request
gc cr multi block request
gc cr block build time
gc current block flush time
gc cr block flush time
gc current block send time
gc cr block send time
gc current block pin time
gc domain validation
gc current retry
ges inquiry response
gcs log flush sync
What is the difference between a CR (consistent read) block and a CUR (current) block?
The current block contains changes for all the committed and yet-to-be-committed transactions. A consistent read (CR) block represents a consistent snapshot of the data from a previous point in time.
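To make the distinction concrete, here is a minimal sketch of how a consistent-read image can be derived from the current image by rolling back changes made after the query's point in time. The record layout (SCN, row key, previous value) and all names are assumptions chosen for the illustration; real undo records and block formats are far more involved.

```python
from typing import Dict, List, Tuple

# (SCN of the change, row key, value before the change) -- hypothetical layout
UndoRecord = Tuple[int, str, int]

def build_cr_block(current: Dict[str, int],
                   undo: List[UndoRecord],
                   query_scn: int) -> Dict[str, int]:
    """Return a snapshot of the block as it looked at query_scn."""
    cr = dict(current)  # work on a copy; the current block stays untouched
    # Undo the newest changes first, stopping at changes the query may see.
    for scn, key, old_value in sorted(undo, key=lambda r: r[0], reverse=True):
        if scn > query_scn:
            cr[key] = old_value
    return cr

current = {"balance": 300}                             # includes all changes so far
undo = [(110, "balance", 100), (120, "balance", 200)]  # 100 -> 200 -> 300
print(build_cr_block(current, undo, 115))  # {'balance': 200}
print(build_cr_block(current, undo, 130))  # {'balance': 300}
```

A query that started at SCN 115 sees the value as of the change at SCN 110, while the current block keeps the latest value.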
What is the difference between crash recovery and instance recovery?
When the instance of a single-node database crashes, crash recovery takes place at the next startup. In a RAC environment, when one instance fails, the same recovery is performed on its behalf by a surviving instance; this is called instance recovery.
How to kill a session in Oracle RAC?
To kill a session on the instance you are connected to, use the same command as on a single-node database:
alter system kill session 'SID,SERIAL#';
To kill a session on another instance while connected to a different instance, append the instance ID:
alter system kill session 'SID,SERIAL#,@instance_id';
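As an illustration of the two command forms, the small helper below builds the statement text from numeric identifiers. The function name is hypothetical, and it assumes the SID, SERIAL# and instance ID were looked up beforehand (for example in GV$SESSION); it only formats a string and does not connect to a database.

```python
from typing import Optional

def kill_session_sql(sid: int, serial: int, inst_id: Optional[int] = None) -> str:
    """Build an ALTER SYSTEM KILL SESSION statement for a local or remote instance."""
    # int() rejects anything that is not a plain number before it reaches the SQL text.
    target = f"{int(sid)},{int(serial)}"
    if inst_id is not None:
        target += f",@{int(inst_id)}"  # remote form: 'SID,SERIAL#,@instance_id'
    return f"ALTER SYSTEM KILL SESSION '{target}'"

print(kill_session_sql(123, 4567))     # ALTER SYSTEM KILL SESSION '123,4567'
print(kill_session_sql(123, 4567, 2))  # ALTER SYSTEM KILL SESSION '123,4567,@2'
```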
What is Oracle RAC One Node?
The Oracle RAC One Node option is a single instance of an Oracle RAC database running in the cluster. This option provides the flexibility to consolidate multiple databases in a cluster, with benefits such as high availability and online rolling patch application.
Oracle Clusterware Specific Questions
What are the Oracle Clusterware daemon processes and what do they do?
ocssd: Manages cluster configuration by controlling which nodes are members of the cluster. It is managed by the cssdagent process and monitored by both cssdagent and cssdmonitor.
crsd: Manages high availability operations within the cluster. It manages resources such as databases, listeners, and VIPs.
It manages cluster resources based on the configuration information that is stored in the OCR for each resource. This includes start, stop, monitor, and failover operations.
oprocd: Spawned by init.cssd and runs as the root user. It is used to detect hardware and driver freezes on the local node. If the node has been frozen long enough for the other nodes to have evicted it from the cluster, oprocd reboots the node so that it cannot issue further I/O to the shared disks.
ctssd: Cluster Time Synchronization Service. Provides time management in a cluster for Oracle Clusterware.
evmd: Event Manager daemon. Publishes events created by Oracle Clusterware.
ohasd: Oracle High Availability Services daemon. It is the first process to start, and it spawns all the other Clusterware processes.
orarootagent: The Oracle root agent. A specialized oraagent process that helps crsd manage resources owned by root, such as the network and the Grid virtual IP address.
oraagent: Extends Clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g release 1 (11.1).
What are the Main Clusterware components?
Voting Disk – Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR) – Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster. The crsd daemon manages the configuration information in the OCR and records changes to the cluster in the registry.
Virtual IP (VIP) – When a node fails, the VIP associated with it is automatically failed over to some other node and new node re-arps the world indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.
crsd – Cluster Ready Services daemon. It manages the startup of resources such as databases, listeners, and VIPs
cssd – Cluster Synchronization Services daemon. Manages cluster configuration by controlling which nodes are members of the cluster
evmd – Event Manager daemon
What is the difference between SRVCTL and CRSCTL utilities?
SRVCTL (Server Control) is a command-line interface that you can use to manage Oracle resources, such as databases, services, or listeners in the cluster.
It manages the Oracle-supplied resources and server pools whose names are prefixed with ora.*
CRSCTL (Oracle Clusterware Control) is a command-line tool that you can use to manage Oracle Clusterware. CRSCTL should be used for general Clusterware management and management of individual resources.
Oracle Clusterware 11g release 2 (11.2) introduces cluster-aware commands with which you can perform operations from any node in the cluster on another node in the cluster, or on all nodes in the cluster, depending on the operation.
What is the OCR file?
OCR stands for Oracle Cluster Registry. It stores and manages information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, virtual IP addresses (VIPs), services, and any applications. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster. The crsd daemon manages the configuration information in the OCR and records changes to the cluster in the registry.
The OCR stores configuration information as a series of key-value pairs in a tree structure. Oracle recommends multiplexing the OCR files.
Which file holds the OCR location on the filesystem? Or, how do you identify the OCR file location?
Check /var/opt/oracle/ocr.loc or /etc/ocr.loc or /etc/oracle/ocr.loc (the path depends on the platform),
or run ocrcheck, which reports the OCR location along with the result of its integrity check.
How do we take a backup of the OCR file, and how do we check which backups are available?
Oracle provides a utility called ocrconfig for backing up the Oracle Cluster Registry files.
In Oracle 10.1 and above, the OCR is backed up automatically by one node in the cluster every four hours. The previous three backup copies are retained. A backup is also retained for each full day and at the end of each week. The frequency of backups and the number of files retained cannot be modified.
The backups are stored in $ORA_CRS_HOME/cdata/cluster_name (for example, $ORA_CRS_HOME/cdata/crs for a cluster named crs).
You can take an on-demand logical backup using the command below:
ocrconfig -export file_name.dmp
It is also recommended to copy the automatic backups taken by CRS to another location:
$cp -p -R /u01/app/crs/cdata /u02/crs_backup/ocrbackup/RAC1
or, more generally:
$cp -p -R $ORA_CRS_HOME/cdata/crs $BACKUP_LOCATION
You can inspect the contents of a backup file using ocrdump, which writes them to a readable text file (the dump is for viewing only and cannot be used for a restore):
ocrdump -backupfile my_file
You can list the available backups using the command below:
ocrconfig -showbackup
How to recover OCR file?
Stop Oracle Clusterware on each node using:
crsctl stop crs
Restore the backup file using
ocrconfig -restore $ORA_CRS_HOME/cdata/crs/backup00.ocr
Start Oracle Clusterware on each node using:
crsctl start crs
The OCR can also be restored from an export dump:
ocrconfig -import file_name.dmp
What is local OCR (OLR)?
What is a voting file/disk, and how many should there be?
Oracle Clusterware uses the voting disk to determine which nodes are members of a cluster. The voting disk is akin to a quorum disk, which helps to avoid the split-brain syndrome. Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on a shared cluster file system or a shared raw device.
The number of voting files should be odd (1, 3, 5, and so on), up to a maximum of 15.
If you configure voting disks on Oracle ASM, then you do not need to manually configure the voting disks. Depending on the redundancy of your disk group, an appropriate number of voting disks are created.
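The reason for an odd count follows from the majority rule: a node must be able to access more than half of the voting disks to stay in the cluster. The short sketch below works through that arithmetic; the function names are made up for the illustration.

```python
def votes_needed(total_disks: int) -> int:
    """Smallest number of voting disks that is strictly more than half."""
    return total_disks // 2 + 1

def tolerated_failures(total_disks: int) -> int:
    """How many voting disks can be lost while a majority is still reachable."""
    return total_disks - votes_needed(total_disks)

for n in (1, 2, 3, 4, 5):
    print(n, votes_needed(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from 3 to 4 disks (or from 1 to 2) does not increase the number of failures that can be survived, which is why an even count adds nothing.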
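How to take a backup of the voting file?
Run the following command to back up a voting disk. Perform this operation on every voting disk as needed, where voting_disk_name is the name of the active voting disk and backup_file_name is the name of the file to which you want to back up the voting disk contents. (This method applies to releases before Oracle Clusterware 11g release 2; from 11.2 onward, the voting disk data is backed up automatically in the OCR and dd is no longer supported for this purpose.)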
dd if=voting_disk_name of=backup_file_name
dd if=/u/app/oracle/oradata/votingdisk-1 of=/u/oracle/backup/vote/vote1.dmp bs=4k
17646+1 records in
17646+1 records out
How to recover the voting disk?
Run the following command to recover a voting disk where backup_file_name is the name of the voting disk backup file and voting_disk_name is the name of the active voting disk:
dd if=backup_file_name of=voting_disk_name
dd if=/u/oracle/backup/vote/vote1.dmp of=/u/app/oracle/oradata/votingdisk-1 bs=4k
17646+1 records in
17646+1 records out
How do I identify the voting disk location?
crsctl query css votedisk
How to change the voting disk configuration after installing Real Application Clusters?
You can dynamically add and remove voting disks after installing Real Application Clusters. Run the following command as the root user to add a voting disk, where path is the fully qualified path of the voting disk:
crsctl add css votedisk path
Run the following command as the root user to remove a voting disk:
crsctl delete css votedisk path
Note: If Oracle Clusterware is down on all nodes, use the -force option:
crsctl add css votedisk path -force
crsctl delete css votedisk path -force
What are the various IPs used in RAC? Or, how many IPs do we need in RAC?
Public IP, Private IP, Virtual IP, SCAN IP
What is the use of virtual IP?
When a node fails, the VIP associated with it is automatically failed over to some other node and new node re-arps the world indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don’t really have a good HA solution without using VIPs.
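What is the use of the SCAN IP (SCAN name), and will it provide load balancing?
Single Client Access Name (SCAN) is a feature introduced in Oracle Real Application Clusters (RAC) 11g Release 2 that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change their connection configuration when you add or remove nodes in the cluster. SCAN also provides connection load balancing: the SCAN listeners hand each new connection to one of the instances offering the requested service.
How many SCAN listeners will be running?
Three, when the SCAN name resolves to the recommended three IP addresses: one SCAN listener runs per SCAN IP, regardless of the number of nodes.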
What is FAN( Fast Application Notification )?
Applications can use Fast Application Notification (FAN) to enable rapid failure detection, balancing of connection pools after failures, and re-balancing of connection pools when failed components are repaired. The FAN process uses system events that Oracle publishes when cluster servers become unreachable or if network interfaces fail.
What is FCF( Fast Connection Failover )?
Fast Connection Failover provides high availability to FAN integrated clients, such as clients that use JDBC, OCI, or ODP.NET. If you configure the client to use fast connection failover, then the client automatically subscribes to FAN events and can react to database UP and DOWN events. In response, Oracle gives the client a connection to an active instance that provides the requested database service.
What are TAF and the TAF policies?
Transparent Application Failover (TAF) is a runtime failover feature for high availability environments, such as Real Application Clusters and Oracle Real Application Clusters Guard. TAF refers to the failover and re-establishment of application-to-service connections. It enables client applications to automatically reconnect to the database if the connection fails, and optionally resume a SELECT statement that was in progress. This reconnect happens automatically from within the Oracle Call Interface (OCI) library.
Which components are included in nodeapps?
VIP, listener, ONS, GSD
What are the uses of services? How to find out the services in a cluster?
Database services (services) are logical abstractions for managing workloads in Oracle Database. Services divide workloads into mutually disjoint groupings. Applications should use services to connect to the Oracle database. Services define rules and characteristics (unique name, workload balancing, failover options, and high availability) to control how users and applications connect to database instances.
In an Oracle Real Application Clusters (Oracle RAC) environment, a service can span one or more instances and facilitate workload balancing based on transaction performance. This provides end-to-end unattended recovery, rolling changes by workload, and full location transparency. Oracle RAC also enables you to manage several service features with Enterprise Manager, the DBCA, and the Server Control utility (SRVCTL).
How to find out the master node in an Oracle RAC cluster?
To find out which node is the master node, use any one of the methods below.
(i) olsnodes: whichever node is displayed first is the master node of the cluster.
(ii) select MASTER_NODE from v$ges_resource;
(iii) Check the ocssd.log file and search for “master node number”.
(iv) oclumon manage -get master (in Oracle RAC 12c)
(v) ocrconfig -manualbackup: the node that takes the manual backup is the master node (ocrconfig -showbackup lists the node name for each backup).
How to know the public IPs, private IPs, VIPs in RAC?
olsnodes -n -p -i
node1-pub 1 node1-prv node1-vip
node2-pub 2 node2-prv node2-vip
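If the output needs to be consumed by a script, it can be split into fields. The parser below is a sketch that assumes exactly the four-column layout shown in the sample above (node name, node number, private name, VIP name); the actual columns depend on the options used and on the Clusterware release, so treat the field order as an assumption. The function name is hypothetical.

```python
from typing import Dict, List, Union

def parse_olsnodes(output: str) -> List[Dict[str, Union[str, int]]]:
    """Parse 'olsnodes -n -p -i' style output into one dict per node."""
    nodes: List[Dict[str, Union[str, int]]] = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines or lines that do not match the assumed layout
        name, number, private, vip = fields
        nodes.append({"node": name, "number": int(number),
                      "private": private, "vip": vip})
    return nodes

sample = """node1-pub 1 node1-prv node1-vip
node2-pub 2 node2-prv node2-vip"""
for node in parse_olsnodes(sample):
    print(node["node"], node["private"], node["vip"])
```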
Which utility is used to start a database or instance? How can you shut down a single instance?
srvctl start database -d database_name
srvctl start instance -d database_name -i instance_name
srvctl stop instance -d database_name -i instance_name
How to check the status of the cluster (all nodes) and of a single node?
To check the viability of Cluster Synchronization Services (CSS) across nodes (add the -all option to report on every node in the cluster):
$ crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
To check the viability of Cluster Synchronization Services (CSS) on one node:
$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
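For monitoring scripts, the status lines can be checked programmatically. The sketch below assumes the message layout shown in the sample output above ("CRS-nnnn: component is state"); other message formats are ignored rather than interpreted, and the function names are hypothetical.

```python
import re
from typing import Dict

# Matches lines such as "CRS-4537: Cluster Ready Services is online".
STATUS_LINE = re.compile(r"^(CRS-\d+): (.+) is (\w+)$")

def parse_crsctl_check(output: str) -> Dict[str, str]:
    """Map each component name to its reported state."""
    status: Dict[str, str] = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            status[match.group(2)] = match.group(3)
    return status

def all_online(output: str) -> bool:
    """True only if at least one component was reported and all are online."""
    status = parse_crsctl_check(output)
    return bool(status) and all(state == "online" for state in status.values())

sample = """CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online"""
print(all_online(sample))  # True
```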
What is HAS (High Availability Services) and what are its commands?
HAS (Oracle High Availability Services) is the layer that starts and manages ASM, database instances, and listeners.
crsctl check has
crsctl config has
crsctl disable has
crsctl enable has
crsctl query has releaseversion
crsctl query has softwareversion
crsctl start has
crsctl stop has [-f]
What is the interconnect used for? How do you determine which protocol is being used for interconnect traffic?
It is a private network that is used to ship data blocks from one instance to another for Cache Fusion. The physical data blocks as well as data dictionary blocks are shared across this interconnect.
One way to determine the protocol is to look at the database alert log entries written when the database was started up.
What is split-brain syndrome?
It arises when nodes lose contact with each other and two or more instances each attempt to control the cluster database independently. In a two-node environment, both instances would attempt to manage updates at the same time without coordinating with each other.
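How the cluster decides which side of a split survives can be sketched with a simplified rule that is commonly described for older Clusterware releases: the sub-cluster with the most nodes survives, and on a tie the sub-cluster containing the lowest node number survives. This rule is an assumption made for the illustration; newer releases take additional factors into account, and the function name is hypothetical.

```python
from typing import List, Set

def surviving_subcluster(subclusters: List[Set[int]]) -> Set[int]:
    """Pick the surviving group of node numbers after an interconnect split.

    Simplified rule (an assumption, see the text above): the larger group wins;
    if sizes are equal, the group holding the lowest node number wins.
    Every group must contain at least one node.
    """
    return max(subclusters, key=lambda group: (len(group), -min(group)))

print(surviving_subcluster([{1}, {2}]))             # {1}
print(surviving_subcluster([{1}, {2, 3}]))          # {2, 3}
print(surviving_subcluster([{3, 4}, {1, 2}, {5}]))  # {1, 2}
```

The nodes outside the surviving group are evicted, which is what prevents two sides from updating the database independently.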