Thursday, November 24, 2011

RESTORING OCR DISK & VOTING DISK ON ASM DISK GROUP


RECOVERING OCR DISK & VOTING DISK ON ASM DISK GROUP FROM CORRUPTION OR LOSS

When the OCR and voting disk are lost or corrupted, follow the procedure below to bring them back.

When using an ASM disk group for CRS, there are typically 3 different types of files located in the disk group that may need to be restored or recreated for the cluster to function:
  • Oracle Cluster Registry file (OCR)
  • Voting files
  • Shared SPFILE for the ASM instances
In this scenario, we are trying to restore the corrupted OCR Disk & Voting Disk from the backup.


Step #1 Stop the cluster on each node (root user)

# crsctl stop crs -f

Step #2 Start the cluster in exclusive mode (root user)

As root, start GI in exclusive mode on one node only.
In 11.2.0.1 RAC, use the option below to start the cluster in exclusive mode:
# crsctl start crs -excl

In 11.2.0.2 RAC, use the options below to start the cluster in exclusive mode:
# crsctl start crs -excl -nocrs

Note: A new option '-nocrs' was introduced with 11.2.0.2; it prevents the start of the ora.crsd resource. It is vital that this option is specified; otherwise the failure to start the ora.crsd resource will tear down ora.cluster_interconnect.haip, which in turn will cause ASM to crash.
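
The version-dependent choice above can be sketched as a small shell helper. This is only an illustration: the function name excl_start_cmd is hypothetical, the version string is assumed to be supplied by hand in dotted form (for example 11.2.0.2), and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the exclusive-mode start command for a
# given Grid Infrastructure version (dotted form, e.g. 11.2.0.2).
excl_start_cmd() {
    version="$1"
    case "$version" in
        11.2.0.1|11.2.0.1.*)
            # 11.2.0.1 has no -nocrs option
            echo "crsctl start crs -excl"
            ;;
        11.2.0.[2-9]|11.2.0.[2-9].*)
            # 11.2.0.2 and later 11.2 patch sets: keep ora.crsd down
            echo "crsctl start crs -excl -nocrs"
            ;;
        *)
            # Anything else is outside the scope of this procedure
            echo "unsupported version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, excl_start_cmd 11.2.0.2 prints the command with -nocrs, which you can then run as root on one node.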


If the OCR disk group does not exist, create it first; otherwise move on to restoring the OCR.


Step #3 OCR RESTORE

To find the OCR location in the cluster environment
$ cat /etc/oracle/ocr.loc  -- In Linux
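
If you only want the location itself, a minimal sketch like the one below can pull it out of the file. It assumes the file contains a line of the form ocrconfig_loc=<location>; the function name ocr_location is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the OCR location recorded in ocr.loc.
# Assumption: the file holds a line such as "ocrconfig_loc=+OCR_VOTE".
ocr_location() {
    loc_file="${1:-/etc/oracle/ocr.loc}"
    # Print only the value after "ocrconfig_loc="
    sed -n 's/^ocrconfig_loc=//p' "$loc_file"
}
```

Called without arguments it reads /etc/oracle/ocr.loc; a different path can be passed as the first argument.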

To check whether the OCR is corrupted

# ocrcheck

Check whether ocrcheck completes successfully

Example ocrcheck output
# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       4404
         Available space (kbytes) :     257716
         ID                       : 1306201859
         Device/File Name         :  +OCR_VOTE
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded

         Logical corruption check succeeded
        

Note: 1) Check whether the cluster registry integrity check succeeded.
2) When you run ocrcheck as the oracle user, the logical corruption check is bypassed. In that case the following line appears at the end of the "ocrcheck" output:
"Logical corruption check bypassed due to non-privileged user"
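
The two checks in the note can be automated with a rough sketch like this one. The function name ocr_check_ok is hypothetical; it simply looks for the two success lines shown in the example output above and treats anything else, including the "bypassed" message, as a failure.

```shell
#!/bin/sh
# Hypothetical helper: read ocrcheck output on stdin and succeed only
# if both the integrity check and the logical corruption check passed.
ocr_check_ok() {
    output=$(cat)
    printf '%s\n' "$output" | grep -q 'Cluster registry integrity check succeeded' || return 1
    printf '%s\n' "$output" | grep -q 'Logical corruption check succeeded' || return 1
    return 0
}

# Usage (as root, so the logical corruption check is not bypassed):
#   ocrcheck | ocr_check_ok && echo "OCR looks healthy"
```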


If the OCR is corrupted, perform the steps below.

Locate the OCR log file
$GRID_HOME/log/<hostname>/client/ocrcheck_<pid>.log
Locate the latest automatic OCR backup
$GRID_HOME/bin/ocrconfig -showbackup

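Restore the latest OCR backup (root user)
# ocrconfig -restore $GRID_HOME/cdata/racsapie1/backup00.ocr
racsapie1 is the cluster name (in this environment it is also the SCAN name); the automatic backups are kept under $GRID_HOME/cdata/<cluster_name>.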
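
As a cross-check on which backup file is newest, the following sketch picks the most recently modified .ocr file in a backup directory. It is only an approximation under the assumption that the backups are on the local node; ocrconfig -showbackup remains the authoritative list, because the backups may have been written on another node. The function name latest_ocr_backup is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the newest *.ocr file in a backup directory,
# judged by file modification time.
latest_ocr_backup() {
    backup_dir="$1"
    # ls -t sorts newest first; keep only the first entry
    ls -t "$backup_dir"/*.ocr 2>/dev/null | head -n 1
}

# Usage:
#   latest_ocr_backup "$GRID_HOME/cdata/racsapie1"
```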

Step #4 VOTING DISK RECREATE
           
Recreate the Voting file (root user)
The Voting file needs to be initialized in the CRS disk group
# crsctl replace votedisk +OCR_DISK
Note: 1) The above command re-creates (or moves) the voting disk in the specified ASM disk group. If you then query the voting disk, it shows its location in the disk group specified above.
2) Taking a manual backup of the voting file with dd is no longer supported. Instead, the voting file is backed up automatically into the OCR.

Query Voting Disk location

# $GRID_HOME/bin/crsctl query css votedisk

Note: In 11.2 you cannot create more than one voting disk, in the same or in a different disk group, when using external redundancy. The number of voting disks follows the disk group redundancy:
External = 1 voting disk
Normal = 3 voting disks
High = 5 voting disks
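
The rule above can be written as a small lookup, which is handy for comparing against the number of lines that "crsctl query css votedisk" reports. This is a sketch only; the function name expected_votedisks is hypothetical, and EXTERN is accepted as well on the assumption that the redundancy value is taken from the TYPE column of v$asm_diskgroup.

```shell
#!/bin/sh
# Hypothetical helper: print the expected number of voting disks
# for a given ASM disk group redundancy level.
expected_votedisks() {
    # Normalize to upper case so "normal" and "NORMAL" both work
    redundancy=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case "$redundancy" in
        EXTERNAL|EXTERN) echo 1 ;;
        NORMAL)          echo 3 ;;
        HIGH)            echo 5 ;;
        *)
            echo "unknown redundancy: $1" >&2
            return 1
            ;;
    esac
}
```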

Step #5 Stop & start the cluster

Shut down CRS: CRS is running in exclusive mode, so it needs to be shut down (root user).

# crsctl stop crs -f

Start CRS: Start CRS on one node; if everything is OK, start CRS on the other nodes (root user).

# crsctl start crs

CRS status: Once it has started, you can check the status of CRS (root / oracle user).

# crsctl stat res -t -init      -- if you are checking one node
# crsctl check cluster -all     -- if you are checking the entire cluster
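
To turn the cluster-wide check into a pass/fail result, a sketch along these lines can scan the output. It assumes that each status line starts with "CRS-" and that a healthy component is reported with the words "is online"; the function name cluster_all_online is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "crsctl check cluster -all" output on stdin
# and succeed only if every CRS- status line reports "is online".
cluster_all_online() {
    awk '
        /^CRS-/ { seen = 1; if ($0 !~ /is online/) bad = 1 }
        END     { if (seen && !bad) exit 0; exit 1 }
    '
}

# Usage:
#   crsctl check cluster -all | cluster_all_online && echo "all nodes online"
```

Empty input is treated as a failure, so a command that prints nothing does not count as a healthy cluster.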



Important Tips

Oracle Clusterware 11g Release 2 backs up the OCR automatically every four hours, on a schedule that depends on when the node started:
  • 4-hour backups (3 max): backup00.ocr, backup01.ocr, and backup02.ocr
  • Daily backups (2 max): day.ocr and day_.ocr
  • Weekly backups (2 max): week.ocr and week_.ocr
You can use the ocrconfig command to view the current automatic OCR backups:
ocrconfig -showbackup auto
 
Note: Automatic backups do not run while the cluster is down.
 
Verify OCR integrity on all of the cluster nodes by running the following CVU command:
$ cluvfy comp ocr -n all -verbose


Please provide your valuable comments. Happy Learning!!!!!
 
