When the OCR and voting disk are lost or corrupted, follow the procedure below to bring them back.
BASIC CHECKS: Check whether the OCR is corrupted
# ocrcheck
When an ASM disk group is used for CRS, the OCR disk group typically holds three types of files that may need to be restored or recreated for the cluster to function:
- Oracle Cluster Registry file (OCR)
- Voting files
- Shared SPFILE for the ASM instances
In the scenario below, the ASM instance cannot be started because of the corruption.
Step #1 Stop the cluster on each node (root user).
# crsctl stop crs -f
Step #2 Start the cluster in exclusive mode (root user)
As root, start GI in exclusive mode on one node only.
In 11.2.0.1 RAC, use the following command to start the cluster in exclusive mode:
# crsctl start crs -excl
In 11.2.0.2 RAC, use the following command to start the cluster in exclusive mode:
# crsctl start crs -excl -nocrs
Note: A new option '-nocrs' has been introduced with 11.2.0.2, which prevents the start of the ora.crsd resource. It is vital that this option is specified; otherwise the failure to start the ora.crsd resource will tear down ora.cluster_interconnect.haip, which in turn will cause ASM to crash.
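The version rule above can be sketched as a small shell helper. This is only an illustration: the function name excl_start_cmd is hypothetical, the version string is passed in by hand, and it assumes that every release at or above 11.2.0.2 accepts '-nocrs'. It only prints the command, it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print the exclusive-mode start command that matches
# a Grid Infrastructure version string such as 11.2.0.1 or 11.2.0.2.
excl_start_cmd() {
    # $1: dotted version, first four fields are used
    echo "$1" | awk -F. '{
        # Fold the four fields into one comparable number,
        # e.g. 11.2.0.2 -> 11020002
        v = $1 * 1000000 + $2 * 10000 + $3 * 100 + $4
        if (v >= 11020002)
            print "crsctl start crs -excl -nocrs"
        else
            print "crsctl start crs -excl"
    }'
}
```

For example, excl_start_cmd 11.2.0.2 prints the command with '-nocrs', which you then run as root on one node only.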
If the OCR disk group no longer exists, create the disk group first; otherwise move on to restoring the OCR.
Step #3 RESTORE THE OCR
To find the OCR location in the cluster environment:
$ cat /etc/oracle/ocr.loc     # on Linux
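If you only need the OCR location itself, a minimal sketch like the one below can pull it out of ocr.loc. It assumes the file contains a line of the form ocrconfig_loc=<location>, which is the usual layout on Linux; the function name ocr_location is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the value of ocrconfig_loc from an ocr.loc file.
# Assumes a line such as "ocrconfig_loc=+OCR_VOTE" is present.
ocr_location() {
    # $1: path to ocr.loc (defaults to the Linux location used in this post)
    sed -n 's/^ocrconfig_loc=//p' "${1:-/etc/oracle/ocr.loc}"
}
```

Run it as ocr_location with no argument on a cluster node, or pass the path of a copied ocr.loc file.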
Check whether ocrcheck is able to complete successfully.
Example ocrcheck output:
# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 4404
Available space (kbytes) : 257716
ID : 1306201859
Device/File Name : +OCR_VOTE
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
Note: 1) Check whether the cluster registry integrity check is successful.
2) The logical corruption check is performed only when ocrcheck runs as the root user.
If you run it as the oracle user, the check is bypassed and you will see this line at the end of the ocrcheck output:
"Logical corruption check bypassed due to non-privileged user"
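The two checks in the note can be scripted against saved ocrcheck output. The sketch below is only an illustration: both function names are hypothetical, they read the output from standard input, and they match the message texts shown in this post (plus a "failed" variant, which is an assumption about the wording).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the ocrcheck output on stdin reports
# a successful cluster registry integrity check.
ocr_integrity_ok() {
    grep -q 'Cluster registry integrity check succeeded'
}

# Hypothetical helper: classify the logical corruption check line found in
# the ocrcheck output on stdin as succeeded, bypassed, failed, or unknown.
ocr_logical_check() {
    awk '
        /Logical corruption check succeeded/ { s = "succeeded" }
        /Logical corruption check bypassed/  { s = "bypassed" }
        /Logical corruption check failed/    { s = "failed" }
        END { print (s == "" ? "unknown" : s) }
    '
}
```

A typical use would be to save the output once with out=$(ocrcheck) and then pipe it into each helper with printf '%s\n' "$out". A result of "bypassed" means the command should be rerun as root.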
If the OCR is corrupted, perform the steps below.
Locate the ocrcheck log file:
$GRID_HOME/log/<hostname>/client/ocrcheck_<pid>.log
Locate the latest automatic OCR backup:
# $GRID_HOME/bin/ocrconfig -showbackup
Restore the latest OCR backup (root user)
# ocrconfig -restore $GRID_HOME/cdata/bhurac/backup00.ocr
bhurac -> the cluster name in this environment (the backup directory under cdata is named after the cluster)
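Picking the newest backup from the -showbackup listing can be sketched as follows. This assumes each backup line has the form "<node> <YYYY/MM/DD> <HH:MM:SS> <path>", which may differ between versions; the function name latest_ocr_backup is hypothetical, and lines that do not match that form (such as PROT messages) are ignored.

```shell
#!/bin/sh
# Hypothetical helper: read "ocrconfig -showbackup" style output on stdin and
# print the path of the backup with the most recent date and time.
# Assumed line format: <node> <YYYY/MM/DD> <HH:MM:SS> <path>
latest_ocr_backup() {
    awk 'NF >= 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ {
             # Zero-padded date and time compare correctly as plain strings.
             stamp = $2 " " $3
             if (stamp > best) { best = stamp; path = $4 }
         }
         END { if (path != "") print path }'
}
```

You could then review the printed path and pass it to ocrconfig -restore as root. Checking the path by eye first is a good idea, since the listing format is an assumption here.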
Step #4 RECREATE THE VOTING DISK
Recreate the voting file (root user)
The voting file needs to be initialized in the CRS disk group:
# crsctl replace votedisk +OCR_DISK
Note: 1) The command above recreates or moves the voting disk into the specified ASM disk group. If you query the voting disk afterwards, it shows the location in that disk group.
2) Taking a manual backup of the voting file with dd is no longer supported. Instead, the voting file is backed up automatically into the OCR.
Query the voting disk location
# $GRID_HOME/bin/crsctl query css votedisk
Note: In 11.2, when using external redundancy you cannot create more than one voting disk, either in the same disk group or in a different one. The number of voting disks follows the disk group redundancy:
External = 1 voting disk
Normal = 3 voting disks
High = 5 voting disks
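The redundancy rules above map directly to a small lookup. The sketch below only restates the table from this post; the function name voting_disk_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print how many voting disks an ASM disk group holds
# for a given redundancy level, following the 11.2 rules listed above.
voting_disk_count() {
    # $1: redundancy level, case-insensitive (external, normal, high)
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        external) echo 1 ;;
        normal)   echo 3 ;;
        high)     echo 5 ;;
        *)        echo "unknown redundancy: $1" >&2; return 1 ;;
    esac
}
```

This is handy as a sanity check against the number of lines returned by crsctl query css votedisk.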
NOTE: IN THIS SCENARIO, ONLY MY OCR & VOTING DISK ARE CORRUPTED. IN ANOTHER POST, I WILL PROVIDE THE STEPS TO RESTORE THE SPFILE WHEN THERE IS NO COPY OF IT.
Step #5 Stop and start the cluster
Shutdown CRS -> CRS is running in exclusive mode, so it needs to be shut down (root user).
# crsctl stop crs -f
Start CRS -> Start CRS on one node; if everything is OK, start CRS on the other nodes (root user).
# crsctl start crs
CRS Status -> Once started, you can check the status of CRS (root / oracle user).
# crsctl stat res -t -init -> if you are checking one node
# crsctl check cluster -all -> if you are checking the entire cluster
# $GRID_HOME/bin/crsctl status resource -t -> gives detailed information about each resource
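To keep the order of the steps straight, the whole procedure can be printed as a dry-run checklist before anything is executed. This is only a sketch under stated assumptions: the function name recovery_plan is hypothetical, it assumes 11.2.0.2 (so '-nocrs' is used), and it prints the commands from this post without running any of them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the recovery commands from this post in
# order. Nothing is executed. Assumes 11.2.0.2, hence the -nocrs option.
recovery_plan() {
    # $1: OCR backup file to restore
    # $2: ASM disk group that will hold the voting file (e.g. +OCR_DISK)
    backup=$1
    diskgroup=$2
    echo "crsctl stop crs -f"                   # step 1: on every node
    echo "crsctl start crs -excl -nocrs"        # step 2: on one node only
    echo "ocrconfig -restore $backup"           # step 3: restore the OCR
    echo "crsctl replace votedisk $diskgroup"   # step 4: recreate voting file
    echo "crsctl stop crs -f"                   # step 5: leave exclusive mode
    echo "crsctl start crs"                     # step 5: first node, then others
    echo "crsctl check cluster -all"            # step 5: verify the cluster
}
```

Every printed command is meant to be run by hand as root, one at a time, checking the result of each before moving on.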
Important Tips
1) Oracle Clusterware 11g Release 2 backs up the OCR automatically every four hours, on a schedule that depends on when the node started.
- 4-hour backups (3 max) – backup00.ocr, backup01.ocr, and backup02.ocr
- Daily backups (2 max) – day.ocr and day_.ocr
- Weekly backups (2 max) – week.ocr and week_.ocr
# ocrconfig -showbackup auto
Note: Automatic backups do not occur while the cluster is down.
2) Verify OCR integrity across all cluster nodes by running the CVU command:
$ cluvfy comp ocr -n all -verbose
3) Oracle Local Registry (OLR): this repository is designed to store information and profiles for local resources, that is, resources dedicated to a particular node. It improves the performance of accessing local resource profile information, as well as redundancy and manageability. In a Grid Infrastructure RAC configuration there is usually one global shared OCR and an OLR on each node. In an Oracle Restart environment there is only the OLR. 11g R2 also introduces a new feature, Grid Plug and Play (GPnP), which, as the name implies, helps to automate and simplify some aspects of grid administration. GPnP maintains a profile, an XML file that stores configuration information for some of the components it manages; for example, VIP and interconnect information is stored there.
HAPPY LEARNING!!!!!!!!!