Friday 5 May 2017

Oracle Database Administrator Interview Questions on RAC

Oracle RAC (Voting Disk) FAQs

1. What is a Voting Disk?

A voting disk is a file that maintains node membership details.
All nodes in the cluster read and write heartbeat information in the voting disk.

2. What Information is stored in Voting Disk?

Voting disks contain static and dynamic data.
Static data : Info about nodes in the cluster
Dynamic data : Disk heartbeat logging
It holds important details about cluster node membership, such as
- which node is part of the cluster,
- which node is joining the cluster, and
- which node is leaving the cluster.

3. Why do we need Voting Disk?

CSSD (Cluster Synchronization Services Daemon) processes monitor the health of RAC nodes using two distinct heartbeats: a network heartbeat and a disk heartbeat. Healthy nodes exchange continuous network and disk heartbeats. A break in either heartbeat indicates a possible error scenario. Several scenarios are possible with missing heartbeats:
1. Network heartbeat is successful, but disk heartbeat is missed.
2. Disk heartbeat is successful, but network heartbeat is missed.
3. Both heartbeats failed.
In addition, with many nodes, other scenarios are possible too. A few examples:
1. Nodes have split into N sets of nodes, communicating within each set but not with members of the other sets.
2. Just one node is unhealthy.
Nodes with quorum will maintain active membership of the cluster and other node(s) will be fenced/rebooted.
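The fencing decision above can be sketched in a few lines. This is only an illustration: the tie-break rule used here (largest group wins, and on a tie the group containing the lowest node number wins) is an assumption commonly described for Oracle Clusterware and is not stated in this post, and both function names are hypothetical.

```python
def surviving_subcluster(subclusters):
    """Pick the group of nodes that keeps cluster membership after a split.

    Assumed rule (not taken from the text above): the largest group wins,
    and a tie goes to the group holding the lowest node number.
    """
    # Sort key: bigger groups first, then lower minimum node number.
    return min(subclusters, key=lambda group: (-len(group), min(group)))


def nodes_to_fence(subclusters):
    """Return the nodes outside the surviving group, i.e. the ones to evict."""
    survivor = surviving_subcluster(subclusters)
    return sorted(
        node
        for group in subclusters
        if group is not survivor
        for node in group
    )


# A four-node cluster split into three sets that cannot talk to each other.
split = [{1, 2}, {3}, {4}]
print(surviving_subcluster(split))  # {1, 2}
print(nodes_to_fence(split))        # [3, 4]
```

In this sketch nodes 1 and 2 keep membership and nodes 3 and 4 are the ones that would be fenced/rebooted.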

4. Why do we have an odd number of voting disks?

An odd number of voting disks is required to prevent Split Brain Syndrome.
A node must be able to access more than half of the voting disks to stay in the cluster. This majority rule decides which node is evicted in case of failure.
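The "more than half" rule is easy to check with a little arithmetic. The sketch below is illustrative only; the function names are made up for this example.

```python
def has_quorum(accessible_disks, total_disks):
    """A node keeps membership only if it sees a strict majority of voting disks."""
    return accessible_disks > total_disks / 2


def tolerated_disk_failures(total_disks):
    """Largest number of voting disks that can fail while a majority remains."""
    return (total_disks - 1) // 2


for total in (1, 2, 3, 4, 5):
    print(total, "voting disks tolerate", tolerated_disk_failures(total), "failure(s)")
```

Four voting disks tolerate one failure, exactly like three, so an even count adds cost without adding protection. With two disks, a node that can reach only one of them does not have a majority, so two nodes that each see a different disk would both lose quorum.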

5. How to backup Voting Disk?

Prior to Oracle 11g R2, the voting disk is backed up using the dd command.

From 11g R2, there is no need to back up the voting disk manually.
It is backed up automatically along with the OCR whenever there is a configuration change.

In 11g R2, restoring a voting disk from a manually copied backup file may prevent cluster services from starting up.

Oracle Database Administrator Interview Questions on RMAN

RMAN INTERVIEW QUESTIONS
========================

1. What is the difference between a recovery catalog and a Nocatalog backup in RMAN?

NoCatalog Backup:

By default, RMAN connects to the target database in Nocatalog mode.
In Nocatalog mode, backup information and RMAN metadata are stored in the target database control file.

Catalog Backup:

A recovery catalog is a schema created in a separate database that contains metadata obtained from the target control file.
We can store RMAN scripts in the recovery catalog.

We can store metadata about multiple incarnations of a single target database in the catalog.
The recovery catalog is central and can hold information about many databases.
If the control file is lost and must be restored from backup, the backup configuration information is still available from the catalog, even while the target database is not mounted.

2. What are the differences between the crosscheck and validate commands?

The validate command examines a backup set and reports whether it can be restored. RMAN scans all of the backup pieces in the specified backup sets and checks the checksums to verify that the contents are intact, so that the backup can be restored successfully if necessary.

The crosscheck command verifies the status of backups and copies recorded in the RMAN repository against media such as disk or tape. It only processes files created on the same device type as the channel running the crosscheck.

3. What is an obsolete backup and an expired backup?

A status of “expired” means that a crosscheck did not find the backup piece or backup set in the backup destination.

A status of “obsolete” means the backup piece is still available but is no longer needed. RMAN no longer needs it under the configured retention policy, that is, after so many days have elapsed or so many newer backups have been taken.
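The difference between the two statuses can be illustrated with a toy model. This is a simplified sketch, not how RMAN is implemented: it assumes a redundancy-style retention policy, treats the statuses as mutually exclusive, and uses a hypothetical `BackupRecord` type with made-up backup names.

```python
from dataclasses import dataclass


@dataclass
class BackupRecord:
    name: str            # hypothetical backup piece name
    completed_day: int   # day number the backup finished (illustrative)
    on_media: bool       # whether a crosscheck found the piece on disk or tape


def classify(records, redundancy):
    """Label each repository record the way the answer above describes.

    EXPIRED   : recorded in the repository but missing from the destination.
    OBSOLETE  : still present, but older than the newest `redundancy` backups.
    AVAILABLE : present and still needed.
    """
    present = sorted(
        (r for r in records if r.on_media),
        key=lambda r: r.completed_day,
        reverse=True,
    )
    needed = {r.name for r in present[:redundancy]}
    status = {}
    for r in records:
        if not r.on_media:
            status[r.name] = "EXPIRED"
        elif r.name in needed:
            status[r.name] = "AVAILABLE"
        else:
            status[r.name] = "OBSOLETE"
    return status


records = [
    BackupRecord("bkp_mon", 1, True),
    BackupRecord("bkp_tue", 2, False),  # file was removed outside RMAN
    BackupRecord("bkp_wed", 3, True),
    BackupRecord("bkp_thu", 4, True),
]
for name, state in classify(records, redundancy=2).items():
    print(name, state)
```

With a redundancy of 2, the Monday backup is still on media but no longer needed (obsolete), while the Tuesday backup is recorded but missing (expired).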

4. What is the difference between a hot backup and an RMAN backup?

For a hot backup, the database must be put in backup mode. An RMAN backup does not need the database to be in backup mode.

5. Which tables hold the information required for the RMAN list & report commands?
V$BACKUP_FILES and recovery catalog views, e.g., RC_DATAFILE_COPY or RC_ARCHIVED_LOG.

The views below contain RMAN catalog information:

RC_DATABASE_INCARNATION
RC_BACKUP_COPY_DETAILS
RC_BACKUP_CORRUPTION
RC_BACKUP_DATAFILE_SUMMARY




Friday 30 December 2016

What is OCR (Oracle Cluster Registry)?

Oracle Cluster Registry

OCR is the central repository for the Oracle cluster.
OCR holds real-time information about ports, node failures, node reconfiguration, database status, listener status, instance status, and network status.


Default location of the ocr.loc file on Linux: /etc/oracle/ocr.loc

Oracle Clusterware reads ocr.loc to locate the registry and determine the application resources that need to be started on each cluster node.

The processes below update the OCR with real-time information.

1. CRSd updates the OCR with information about node failures
2. CSSd updates the OCR when a node is added or deleted
3. NetCA, DBCA, and SRVCTL update the OCR with service-related information

The OCR also records the active software version.

The software version can also be found using the command below
$ crsctl query crs activeversion



OCR Backup and Restore



OCR is a binary file and cannot be edited manually. By default, OCR is backed up automatically every 4 hours (counted from the time CRS started).
OCR backups are cached across all nodes of a cluster. However, only the master RAC node can write to the OCR.
OCR backups are overwritten automatically. Clusterware keeps the last 3 of these backups.

The backup location can be defined using the command below.

$ ocrconfig -backuploc <backup location>

It is recommended to use a shared location for storing OCR backups.

To view the current OCR backups available, use the commands below

$ ocrconfig -showbackup auto

$ ocrconfig -showbackup manual


After any clusterware-related changes, an OCR backup can be taken manually.

$ ocrconfig -manualbackup


If the OCR file is corrupted or lost, the CRS instance may crash.

To restore the OCR from a backup, follow the procedure below


1. Identify the latest OCR backup using ocrconfig

$ ocrconfig -showbackup

2. Shut down CRS services on all RAC nodes

$ crsctl stop crs

3. Start CRS on one node in exclusive mode

$ crsctl start crs -excl

Stop crsd, if it is running

$ crsctl stop resource ora.crsd -init

4. Restore the OCR

$ ocrconfig -restore <backuplocation/backupfilename>

5. Verify the integrity of the OCR

$ ocrcheck

6. Shut down CRS services on the node that was started in exclusive mode

$ crsctl stop crs

7. Start CRS on all RAC nodes

$ crsctl start crs

8. Verify the restored OCR

$ cluvfy comp ocr -n all -verbose
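Because the order of the steps matters, it can help to print the whole sequence as a checklist before touching the cluster. The sketch below is a dry run only: it executes nothing, the function name is hypothetical, and the backup path is a made-up placeholder rather than a real default.

```python
def ocr_restore_plan(backup_file):
    """Return the restore steps above as (where, command) pairs, in order.

    Nothing is executed; the list is only meant for review before the
    commands are run by hand.
    """
    return [
        ("any node", "ocrconfig -showbackup"),
        ("all nodes", "crsctl stop crs"),
        ("one node", "crsctl start crs -excl"),
        ("same node", "crsctl stop resource ora.crsd -init"),
        ("same node", f"ocrconfig -restore {backup_file}"),
        ("same node", "ocrcheck"),
        ("same node", "crsctl stop crs"),
        ("all nodes", "crsctl start crs"),
        ("any node", "cluvfy comp ocr -n all -verbose"),
    ]


# The backup path below is an illustrative placeholder.
for step, (where, command) in enumerate(
    ocr_restore_plan("/backup/ocr/backup00.ocr"), start=1
):
    print(f"{step}. [{where}] {command}")
```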

Weblogic Managed Servers Crashed due to weblogic.socket.MaxMessageSizeExceededException

ISSUE
=====

WebLogic Managed Servers are crashing frequently.

The Admin Server log file has the below error message.

 <IOException occurred on socket: Socket[addr=test1.localhost.com/180.81.41.219,port=1345,localport=47093]
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'.
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'
at weblogic.socket.BaseAbstractMuxableSocket.incrementBufferOffset(BaseAbstractMuxableSocket.java:230)
at weblogic.rjvm.t3.MuxableSocketT3.incrementBufferOffset(MuxableSocketT3.java:351)
at weblogic.socket.SocketMuxer.readFromSocket(SocketMuxer.java:989)
at weblogic.socket.SocketMuxer.readReadySocketOnce(SocketMuxer.java:920)
at weblogic.socket.SocketMuxer.readReadySocket(SocketMuxer.java:897)
at weblogic.socket.EPollSocketMuxer.dataReceived(EPollSocketMuxer.java:215)
at weblogic.socket.EPollSocketMuxer.processSockets(EPollSocketMuxer.java:177)
at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:29)
at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:43)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:145)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117) 

Because of the above error, the Admin Server goes into WARNING status and the Managed Servers are disconnected from the Admin Server.


SOLUTION
========

1. Increase Max Message Size from the WLS console


Log in to the WebLogic console URL as the weblogic admin user


WebLogic Console -> Servers -> Admin Server -> Protocols -> General -> Max Message Size

Increase Max Message Size to 20000000 (any value higher than the incoming message size from the error message)

2. Restart MT Services
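To pick a new value, the two sizes can be read straight from the exception text. The helper below is a small sketch, not a WebLogic tool: the function name and the `headroom` multiplier are arbitrary choices for this example, and the regular expression simply follows the wording of the error message shown above.

```python
import re

# Matches the sizes quoted in a MaxMessageSizeExceededException message.
PATTERN = re.compile(
    r"Incoming message of size: '(\d+)' bytes exceeds "
    r"the configured maximum of: '(\d+)' bytes"
)


def suggest_max_message_size(log_line, headroom=2.0):
    """Return (incoming, configured, suggested) sizes in bytes, or None if no match.

    `headroom` is an arbitrary multiplier chosen for this sketch; the
    suggestion is always larger than the incoming message size.
    """
    match = PATTERN.search(log_line)
    if match is None:
        return None
    incoming, configured = int(match.group(1)), int(match.group(2))
    suggested = max(incoming + 1, int(configured * headroom))
    return incoming, configured, suggested


line = (
    "weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: "
    "'10000080' bytes exceeds the configured maximum of: '10000000' bytes "
    "for protocol: 't3'"
)
print(suggest_max_message_size(line))  # (10000080, 10000000, 20000000)
```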