Oracle Apps Database Administration Blog

Friday 5 May 2017

Oracle Database Administrator Interview Questions on RAC

Oracle RAC (Voting Disk ) FAQ's

1.What is Voting Disk?

Voting Disk is a file that maintains node membership details.
All members in the Cluster read and write Heartbeat Information in Voting Disk.

2. What Information is stored in Voting Disk?

Voting disks contain static and dynamic data.
Static data : Info about nodes in the cluster
Dynamic data : Disk heartbeat logging
It maintains and consists of important details about the cluster nodes membership, such as
- which node is part of the cluster,
- which node is joining the cluster, and
- which node is leaving the cluster.

3. Why do we need Voting Disk?

CSSD processes (Cluster Services Synchronization Daemon) monitor the health of RAC nodes employing two distinct heart

beats: Network heart beat and Disk heart beat. Healthy nodes will have continuous network and disk heartbeats exchanged

between the nodes. Break in heart beat indicates a possible error scenario. There are few different scenarios possible

with missing heart beats:
1. Network heart beat is successful, but disk heart beat is missed.
2. Disk heart beat is successful, but network heart beat is missed.
3. Both heart beats failed.
In addition, with numerous nodes, there are other possible scenarios too. Few possible scenarios:
1. Nodes have split in to N sets of nodes, communicating within the set, but not with members in other set.
2. Just one node is unhealthy.
Nodes with quorum will maintain active membership of the cluster and other node(s) will be fenced/rebooted.

4.Why do we have odd number of voting disks?

Odd number of Voting disks are required to prevent Split Brain Syndrome.
A node must be able to access more than half of the voting disks in order to decide which node can be evicted incase of

failure.

5. How to backup Voting Disk?

Prior to Oracle 11g R2 version, Voting Disk is backed up using dd command

From 11g R2, no need to manually backup voting disk.
It is automatically backup along with OCR whenever there is a configuration change.

In 11g R2, Restoring Voting disk from a manually copied backupfile may prevent cluster services from starting up.

Oracle Database Administrator Interview Questions on RMAN

RMAN INTERVIEW QUESTIONS
========================

1.What is the difference between recovery catalog and Nocatalog backup in RMAN?

NoCatalog Backup:

ByDefault RMAN connects to the target database in Nocatalog Mode.
In Nocatalog Mode, Backup Information and Metadata related to RMAN is stored in target database controlfile.

Catalog Backup:

A recovery catalog is a schema created in a separate database that contains metadata obtained from the target control file.
In recovery catalog we can store rman scripts.

We can store metadata about multiple incarnations of a single target database in the catalog.
Recovery catalog is central and can have information of many databases.
If the control file is lost and must be restored from backup, the backup configuration information is available when the database is not mounted.

2.What are the differences between crosscheck and validate commands?

Validate command is to examine a backup set and report whether it can be restored. RMAN scans all of the backup pieces in the specified backup sets and looks at the checksum to verify that the contents are intact so that backup can be successfully restored if necessary.

Crosscheck command is to verify the status of backups and copies recorded in the RMAN repository against media such as disk or tape. The crosscheck command only processes files created on the same device type as the channel running crosscheck.

3.What is obsolete backup & expired backup?

A status of “expired” means that the backup piece or backup set is not found in the backup destination.

A status of “obsolete” means the backup piece is still available, but it is no longer needed. The backup piece is no longer needed since RMAN has been configured to no longer need this piece after so many days have elapsed, or so many backups have been performed.

4.What is the difference between hot backup & RMAN backup?

We need to put the database in Backup mode for Hot Backup. RMAN Backup does not need the database to be in Backup mode.

5. Which Tables are have information required for RMAN list & report commands ?
V$BACKUP_FILES and recovery catalog views e.g., RC_DATAFILE_COPY or RC_ARCHIVED_LOG.

Below tables contain RMAN Catalog information:

RC_DATABASE_INCARNATION
RC_BACKUP_COPY_DETAILS
RC_BACKUP_CORRUPTION
RC_BACKUP-DATAFILE_SUMMARY

Friday 30 December 2016

What is OCR (Oracle Cluster Registry)?

Oracle Cluster Registry

OCR is the central repository for the Oracle cluster.

OCR has information about Port information,node failures, node reconfiguration,Database Status,Listener Status,Instance status, Network Status in real time.

Default Location of ocr.loc file on linux -- /etc/oracle/ocr.loc

Oracle Clusterware reads ocr.loc to determine the registry information and application resources that need to started up on each cluster node.

The below processes update OCR with realtime information.

1. CRSd updates OCR with information about node failure

2. CSSd updates the OCR when a node is added or deleted

3. NetCA, DBCA, SRVCTL updates OCR with services related information

OCR also has information about Software active version

Software version can also be found using below command

$ crsctl query crs activeversion

OCR Backup and Restore

OCR is a binary file and cannot be edited manually.By Default, OCR is backed up automatically every 4 hours (from Instance Startup time).

OCR backups are cached across all nodes of a cluster.However,Only Master RAC node can write to OCR.

OCR backups are overwritten automatically. Clusterware maintains last 3 backups of OCR.

Backup location can be defined using the below command.

$ocrconfig -backuploc <backup location>

It is recommended to use shared location for storing OCR backups.

To view the current OCR backups available, use the below commands

$ocrconfig –showbackup auto

$ocrconfig –showbackup manual

After any clusterware related changes, OCR backup can be taken manually.

$ocrconfig –manualbackup

If OCR file is Corrupted or lost, CRS Instance may crash.

To restore OCR from Backup,follow the procedure below

1.Using ocrconfig identify the latest OCR backup using command below

$ocrconfig -showbackup

2.Shutdown CRS Services on all RAC nodes

$crsctl stop crs

3.Start CRS on one node in exclusive mode

$crsctl start crs -excl

Stop crsd , if it is running

$ crsctl stop resource ora.crsd -init

4.Restore the OCR

$ocrconfig –restore <backuplocation/backupfilename>

5. Verify the integrity of OCR

$ocrcheck

6. Shutdown CRS Services on the node where you had started in exclusive mode

$crsctl stop crs

7.Start CRS on all RAC nodes

$crsctl start crs

8. Verify the restored OCR

$cluvfy comp ocr –n all -verbose

Weblogic Managed Servers Crashed due to weblogic.socket.MaxMessageSizeExceededException

ISSUE
=====

Weblogic Managed Servers are crashing frequently .

Admin Server Logfile has the below error message.

<IOException occurred on socket: Socket[addr=test1.localhost.com/180.81.41.219,port=1345,localport=47093]
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'.
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'
at weblogic.socket.BaseAbstractMuxableSocket.incrementBufferOffset(BaseAbstractMuxableSocket.java:230)
at weblogic.rjvm.t3.MuxableSocketT3.incrementBufferOffset(MuxableSocketT3.java:351)
at weblogic.socket.SocketMuxer.readFromSocket(SocketMuxer.java:989)
at weblogic.socket.SocketMuxer.readReadySocketOnce(SocketMuxer.java:920)
at weblogic.socket.SocketMuxer.readReadySocket(SocketMuxer.java:897)
at weblogic.socket.EPollSocketMuxer.dataReceived(EPollSocketMuxer.java:215)
at weblogic.socket.EPollSocketMuxer.processSockets(EPollSocketMuxer.java:177)
at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:29)
at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:43)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:145)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117)

Due to above error, Admin Server is going into WARNING Status and Managed Servers are disconnected from ADMIN Server.

SOLUTION
========

1. Increase Maxmessage size from WLS console

Login to weblogic console url as weblogic admin user

WebLogic Console -> Servers ->Admin Server -> Protocols -> General -> Max Message
Size

Increase Max Message Size to 20000000 (Any value higher than Incoming message size from error message)

2.Restart MT Services

Tuesday 23 August 2016

How to delete Managed Servers in R12.2

The procedure to increase/decrease the number of oacore/oafm servers in R12.2.X is different from R12.1.x.
Because in R12.2.x OPMN is replaced by Weblogic Server.

Steps to delete a managed Server in R12.2.x

1. Ensure there is no active adop cycle.

2. Shut down the managed server which needs to be deleted from the cluster/domain.

$ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh stop <MANAGED SERVER NAME>

Example:

$ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh stop oacore_server5

3. Execute the script adProvisionEBS.pl on the host where the managed server was created.

Source the RUN filesystem before running the command.

perl $AD_TOP/patch/115/bin/adProvisionEBS.pl \
ebs-delete-managedserver \
-contextfile=<CONTEXT_FILE> -managedsrvname=<MANAGED_SERVER_NAME> \
-servicetype=<SERVICE_TYPE> -logfile=<LOGFILE>
Sample Output:

bash-4.1$ perl $AD_TOP/patch/115/bin/adProvisionEBS.pl \
> ebs-delete-managedserver \
> -contextfile=$CONTEXT_FILE -managedsrvname=oacore_server6 \
> -servicetype=oacore -logfile=$APPLRGF/TXK/delMS_oacoreserver6.log

Enter the APPS Schema password:
Enter the WebLogic AdminServer password:

ManagedServer oacore_server5 deleted successfully.

The above command deletes the managed server and also updates related parameters in context file

To Verify the entries are deleted in Context File,check the output of below command

grep -i oacore_server6 $CONTEXT_FILE

4.Execute the following command to delete details of the managed server from the OHS configuration files mod_wl_ohs.conf and apps.conf

$ perl <FND_TOP>/patch/115/bin/txkSetAppsConf.pl \
-contextfile=<CONTEXT_FILE> \
-configoption=removeMS \
-oacore=<host>.<domain>:<port> \
-oafm=<host>.<domain>:<port> \
-forms=<host>.<domain>:<port> \
-formsc4ws=<host>.<domain>:<port> \
-ekanban=<host>.<domain>:<port> \
-accessgate=<host>.<domain>:<port> \
-yms=<host>.<domain>:<port>

Sample Output:

bash-4.1$ perl $FND_TOP/patch/115/bin/txkSetAppsConf.pl \
-contextfile=$CONTEXT_FILE \
-configoption=removeMS -oacore=localhost.localdomain.com:7205

*** LOG FILE:
/test12/applmgr/fs1/inst/apps/test12_localhost/logs/appl/rgf/TXK/txkSetAppsConf_08009354.log

5. Restart HTTP Server

cd $ADMIN_SCRIPTS_HOME

adapcctl.sh stop

adapcctl.sh start

6. Repeat the above steps on PATCH filesystem.

adpreclone.pl error while running copyBinary.sh on a 12.2 Environment

In a 12.2.5 environment adpreclone.pl errors out while running copyBinary.sh adpreclone.pl on appsTier

$perl adpreclone.pl appsTier

                     Copyright (c) 2011, 2014 Oracle Corporation
                        Redwood Shores, California, USA

                        Oracle E-Business Suite Rapid Clone

                                 Version 12.2

                      adpreclone Version 120.31.12020000.20

Enter the APPS User Password:
Enter the Weblogic AdminServer password :

Checking the status of the Oracle WebLogic Administration Server....

Running perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/patch/115/bin/adProvisionEBS.pl ebs-get-serverstatus -contextfile=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml -servername=AdminServer -promptmsg=hide

The Oracle WebLogic Administration Server is up.

Running:
perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/bin/adclone.pl java=/test12/applmgr/fs1/EBSapps/comn/util/jdk64 mode=stage stage=/test12/applmgr/fs1/EBSapps/comn/clone component=appsTier method= appctx=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml showProgress

Setting the wls environment

Beginning application tier Stage - Thu Aug 18 02:43:26 2016

/test12/applmgr/fs1/EBSapps/comn/util/jdk64/bin/java -Xmx600M -DCONTEXT_VALIDATED=false -Doracle.installer.oui_loc=/oui -classpath /test12/applmgr/fs1/FMW_Home/webtier/lib/xmlparserv2.jar:/test12/applmgr/fs1/FMW_Home/webtier/jdbc/lib/ojdbc6.jar:/test12/applmgr/fs1/EBSapps/comn/java/classes:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/OraInstaller.jar:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/ewt3.jar:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/share.jar:/test12/applmgr/fs1/FMW_Home/webtier/../Oracle_EBS-app1/oui/jlib/srvm.jar:/test12/applmgr/fs1/FMW_Home/webtier/jlib/ojmisc.jar:/test12/applmgr/fs1/FMW_Home/wlserver_10.3/server/lib/weblogic.jar:/test12/applmgr/fs1/FMW_Home/oracle_common/jlib/obfuscatepassword.jar oracle.apps.ad.clone.StageAppsTier -e /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml -stage /test12/applmgr/fs1/EBSapps/comn/clone -tmp /tmp -method CUSTOM   -showProgress -nopromptmsg

Log file located at /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/admin/log/clone/StageAppsTier_08180243.log

\     20% completed
ERROR while running Stage...
Thu Aug 18 02:50:43 2016

ERROR while running perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/bin/adclone.pl java=/test12/applmgr/fs1/EBSapps/comn/util/jdk64 mode=stage stage=/test12/applmgr/fs1/EBSapps/comn/clone component=appsTier method= appctx=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml showProgress ...
Thu Aug 18 02:50:43 2016
[aptest12@vmohswrly128 <RUN> ]

adpreclone log files shows the below error message

Error Message:

START: Creating FMW archive.
Running /test12/applmgr/fs1/FMW_Home/oracle_common/bin/copyBinary.sh -javaHome /test12/applmgr/fs1/EBSapps/comn/clone/FMW/t2pjdk -al /test12/applmgr/fs1/EBSapps/comn/clone/FMW/FMW_Home.jar -smw /test12/applmgr/fs1/FMW_Home -ldl /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/admin/log/clone/fmwT2PStage -invPtrLoc /etc/oraInst.loc -silent true -debug true
Script Executed in 298222 milliseconds, returning status 1
ERROR: Script failed, exit code 1

Solution:

1. set the environment variable T2P_JAVA_OPTIONS as given below

$export T2P_JAVA_OPTIONS="-Djava.io.tmpdir=/test12/applmgr/temp"

2. Make sure enough free space is available in the temp directory provided above

3.Rerun adpreclone.pl on appsTier

Tuesday 5 July 2016

ADOP Prepare Phase Fails with ORA-31011: XML parsing failed

While applying Patch on a test instance (R12.2.5), Prepare phase failed with error ORA-31011: XML parsing failed

Complete error message from adop logfile
==================================

[aptest12@localhost <RUN> ]$ adop phase=prepare

Enter the APPS password:
Enter the SYSTEM password:
Enter the WLSADMIN password:

Validating credentials.

Initializing.
Run Edition context : /test12/applmgr/fs2/inst/apps/test12_localhost/appl/admin/TEST12_localhost.xml
Patch edition context: /test12/applmgr/fs1/inst/apps/test12_localhost/appl/admin/TEST12_localhost.xml
Patch file system free space: 1130.34 GB

Validating system setup.
[ERROR] Failed to execute SQL statement:
select node_name from FND_OAM_CONTEXT_FILES
where NAME not in ('TEMPLATE','METADATA','config.txt') and
CTX_TYPE='A' and (status is null or upper(status) in ('S','F'))
and EXTRACTVALUE(XMLType(TEXT),'//file_edition_type') = 'run'
and EXTRACTVALUE(XMLType(TEXT),'//oa_service_group_status[@oa_var=''s_web_admin_status'']')='enabled'
and EXTRACTVALUE(XMLType(TEXT),'//oa_service_list/oa_service[@type=''admin_server'']/oa_service_status')='enabled'

[ERROR] Error Message:
[ERROR] ORA-31011: XML parsing failed
[ERROR] ORA-19202: Error occurred in XML processing
[ERROR] LPX-00229: input source is empty
[ERROR] ORA-06512: at "SYS.XMLTYPE", line 272
[ERROR] ORA-06512: at line 1 (DBD ERROR: error possibly near <*> indicator at char 208 in ' select node_name from FND_OAM_CONTEXT_FILES
[ERROR] where NAME not in ('TEMPLATE','METADATA','config.txt') and
[ERROR] CTX_TYPE='A' and (status is null or upper(status) in ('S','F'))
[ERROR] and EXTRACTVALUE(<*>XMLType(TEXT),'//file_edition_type') = 'run'
[ERROR] and EXTRACTVALUE(XMLType(TEXT),'//oa_service_group_status[@oa_var=''s_web_admin_status'']')='enabled'
[ERROR] and EXTRACTVALUE(XMLType(TEXT),'//oa_service_list/oa_service[@type=''admin_server'']/oa_service_status')='enabled'
[ERROR] ')
[UNEXPECTED]Error occurred identifying Admin server node

[STATEMENT] Please run adopscanlog utility, using the command

"adopscanlog -latest=yes"

to get the list of the log files along with snippet of the error message corresponding to each log file.

Cause:

fnd_oam_context_files table has incorrect information or the table is corrupted

Solution:

1. Login to sqlplus as applsys and take backup of fnd_oam_context_files table.

SQL> conn applsys
Enter password:
Connected.
SQL> create table fnd_oam_context_files_bkp as select * from fnd_oam_context_files;

Table created.

2. Truncate Table fnd_oam_context_files

SQL> truncate table fnd_oam_context_files;

Table truncated.

3. Bring down MT Services

4. Run Autoconfig to populate fnd_oam_context_files with right information.