Friday 30 December 2016

What is OCR (Oracle Cluster Registry)?

Oracle Cluster Registry

OCR is the central repository for the Oracle cluster.
OCR has information about Port information,node failures, node reconfiguration,Database Status,Listener Status,Instance status, Network Status in real time.


Default Location of ocr.loc file on linux --  /etc/oracle/ocr.loc

Oracle Clusterware reads ocr.loc to determine the registry information and application resources that need to started up on each cluster node.

The below processes update OCR with realtime information.

1. CRSd updates  OCR with  information about node failure 
2. CSSd updates the OCR when a node is added or deleted
3. NetCA, DBCA, SRVCTL updates  OCR with  services related information

OCR also has information about  Software active version

Software  version can also be found using below command
$ crsctl query crs activeversion



OCR Backup and Restore



OCR is a binary file and cannot be edited manually.By Default, OCR is backed up automatically every 4 hours (from Instance Startup time).
OCR backups are cached across all nodes of a cluster.However,Only Master RAC node can write to OCR.
OCR backups are overwritten automatically. Clusterware maintains last 3 backups of OCR. 

Backup location can be defined using the below command.

$ocrconfig -backuploc <backup location>

It is recommended to use shared location for storing OCR backups.

To view the current OCR backups available, use the below commands

$ocrconfig –showbackup auto 

$ocrconfig –showbackup manual


After any clusterware related changes, OCR backup can be taken manually.

$ocrconfig –manualbackup


If OCR file is Corrupted or lost, CRS Instance may crash.

To restore OCR from Backup,follow the procedure below


1.Using ocrconfig identify the latest OCR backup using command below

$ocrconfig -showbackup
2.Shutdown CRS Services on all RAC nodes
$crsctl stop crs
  
3.Start CRS on one node in exclusive mode
$crsctl start crs -excl 
Stop crsd , if it is running

$ crsctl stop resource ora.crsd -init

4.Restore the OCR 
$ocrconfig –restore <backuplocation/backupfilename>
   
5. Verify the integrity of OCR
$ocrcheck
6. Shutdown  CRS Services on the node where you had started in exclusive mode
$crsctl stop crs 


7.Start CRS on all RAC nodes 
$crsctl start crs
8. Verify the restored OCR 
$cluvfy comp ocr –n all -verbose

Weblogic Managed Servers Crashed due to weblogic.socket.MaxMessageSizeExceededException

ISSUE
=====

Weblogic Managed Servers are crashing frequently .

Admin Server Logfile has the below error message.

 <IOException occurred on socket: Socket[addr=test1.localhost.com/180.81.41.219,port=1345,localport=47093]
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'.
weblogic.socket.MaxMessageSizeExceededException: Incoming message of size: '10000080' bytes exceeds the configured maximum of: '10000000' bytes for protocol: 't3'
at weblogic.socket.BaseAbstractMuxableSocket.incrementBufferOffset(BaseAbstractMuxableSocket.java:230)
at weblogic.rjvm.t3.MuxableSocketT3.incrementBufferOffset(MuxableSocketT3.java:351)
at weblogic.socket.SocketMuxer.readFromSocket(SocketMuxer.java:989)
at weblogic.socket.SocketMuxer.readReadySocketOnce(SocketMuxer.java:920)
at weblogic.socket.SocketMuxer.readReadySocket(SocketMuxer.java:897)
at weblogic.socket.EPollSocketMuxer.dataReceived(EPollSocketMuxer.java:215)
at weblogic.socket.EPollSocketMuxer.processSockets(EPollSocketMuxer.java:177)
at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:29)
at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:43)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:145)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:117) 

Due to above error, Admin Server is going into WARNING Status and Managed Servers are disconnected from ADMIN Server.


SOLUTION
========

1. Increase Maxmessage size from WLS console


Login to weblogic console url as weblogic admin user


WebLogic Console -> Servers ->Admin Server ->  Protocols -> General -> Max Message 
Size

Increase Max Message Size to  20000000 (Any value higher than Incoming message size from error message)

2.Restart MT Services

Tuesday 23 August 2016

How to delete Managed Servers in R12.2

The procedure to increase/decrease the number of oacore/oafm servers in R12.2.X is different from R12.1.x.
Because in R12.2.x OPMN is replaced by Weblogic Server.


Steps to delete a managed Server in R12.2.x

1. Ensure there is no active adop cycle.

2. Shut down the managed server which needs to be deleted from the cluster/domain.

 $ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh stop <MANAGED SERVER NAME>

Example:

$ADMIN_SCRIPTS_HOME/admanagedsrvctl.sh stop oacore_server5


3. Execute the script adProvisionEBS.pl  on the host where the managed server was created.

Source the RUN filesystem before running the command.

 perl $AD_TOP/patch/115/bin/adProvisionEBS.pl \
ebs-delete-managedserver \
-contextfile=<CONTEXT_FILE> -managedsrvname=<MANAGED_SERVER_NAME> \
-servicetype=<SERVICE_TYPE> -logfile=<LOGFILE>

Sample Output:

bash-4.1$ perl $AD_TOP/patch/115/bin/adProvisionEBS.pl \
> ebs-delete-managedserver \
> -contextfile=$CONTEXT_FILE -managedsrvname=oacore_server6 \
> -servicetype=oacore -logfile=$APPLRGF/TXK/delMS_oacoreserver6.log

Enter the APPS Schema password:
Enter the WebLogic AdminServer password:

ManagedServer oacore_server5 deleted successfully.


The above command deletes the managed server and also updates related parameters in context file


To Verify the entries are deleted in Context File,check the output of below command

grep -i oacore_server6 $CONTEXT_FILE


4.Execute the following command to delete details of the managed server from the OHS configuration files mod_wl_ohs.conf and apps.conf


$ perl <FND_TOP>/patch/115/bin/txkSetAppsConf.pl \
-contextfile=<CONTEXT_FILE> \
-configoption=removeMS \
-oacore=<host>.<domain>:<port> \
-oafm=<host>.<domain>:<port> \
-forms=<host>.<domain>:<port> \
-formsc4ws=<host>.<domain>:<port> \
-ekanban=<host>.<domain>:<port> \
-accessgate=<host>.<domain>:<port> \
-yms=<host>.<domain>:<port>




Sample Output:

bash-4.1$ perl $FND_TOP/patch/115/bin/txkSetAppsConf.pl \
 -contextfile=$CONTEXT_FILE \
-configoption=removeMS -oacore=localhost.localdomain.com:7205


*** LOG FILE:
/test12/applmgr/fs1/inst/apps/test12_localhost/logs/appl/rgf/TXK/txkSetAppsConf_08009354.log


5. Restart HTTP Server

cd $ADMIN_SCRIPTS_HOME

adapcctl.sh stop

adapcctl.sh start


6. Repeat the above steps on PATCH filesystem.

adpreclone.pl error while running copyBinary.sh on a 12.2 Environment

In a 12.2.5 environment adpreclone.pl errors out while running copyBinary.sh adpreclone.pl on appsTier



$perl adpreclone.pl appsTier

                     Copyright (c) 2011, 2014 Oracle Corporation
                        Redwood Shores, California, USA

                        Oracle E-Business Suite Rapid Clone

                                 Version 12.2

                      adpreclone Version 120.31.12020000.20

Enter the APPS User Password:
Enter the Weblogic AdminServer password :

 Checking the status of the Oracle WebLogic Administration Server....

 Running perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/patch/115/bin/adProvisionEBS.pl ebs-get-serverstatus -contextfile=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml -servername=AdminServer -promptmsg=hide

The Oracle WebLogic Administration Server is up.

Running:
perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/bin/adclone.pl java=/test12/applmgr/fs1/EBSapps/comn/util/jdk64 mode=stage stage=/test12/applmgr/fs1/EBSapps/comn/clone component=appsTier method= appctx=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml showProgress



 Setting the wls environment

Beginning application tier Stage - Thu Aug 18 02:43:26 2016

/test12/applmgr/fs1/EBSapps/comn/util/jdk64/bin/java -Xmx600M -DCONTEXT_VALIDATED=false -Doracle.installer.oui_loc=/oui -classpath /test12/applmgr/fs1/FMW_Home/webtier/lib/xmlparserv2.jar:/test12/applmgr/fs1/FMW_Home/webtier/jdbc/lib/ojdbc6.jar:/test12/applmgr/fs1/EBSapps/comn/java/classes:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/OraInstaller.jar:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/ewt3.jar:/test12/applmgr/fs1/FMW_Home/webtier/oui/jlib/share.jar:/test12/applmgr/fs1/FMW_Home/webtier/../Oracle_EBS-app1/oui/jlib/srvm.jar:/test12/applmgr/fs1/FMW_Home/webtier/jlib/ojmisc.jar:/test12/applmgr/fs1/FMW_Home/wlserver_10.3/server/lib/weblogic.jar:/test12/applmgr/fs1/FMW_Home/oracle_common/jlib/obfuscatepassword.jar  oracle.apps.ad.clone.StageAppsTier -e /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml -stage /test12/applmgr/fs1/EBSapps/comn/clone -tmp /tmp -method CUSTOM   -showProgress -nopromptmsg

Log file located at /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/admin/log/clone/StageAppsTier_08180243.log

  \     20% completed
ERROR while running Stage...
Thu Aug 18 02:50:43 2016

ERROR while running perl /test12/applmgr/fs1/EBSapps/appl/ad/12.0.0/bin/adclone.pl java=/test12/applmgr/fs1/EBSapps/comn/util/jdk64 mode=stage stage=/test12/applmgr/fs1/EBSapps/comn/clone component=appsTier method= appctx=/test12/applmgr/fs1/inst/apps/test12_vmohswrly128/appl/admin/test12_vmohswrly128.xml showProgress ...
Thu Aug 18 02:50:43 2016
[aptest12@vmohswrly128 <RUN> ]


adpreclone log files shows the below error message

Error Message:

START: Creating FMW archive.
Running /test12/applmgr/fs1/FMW_Home/oracle_common/bin/copyBinary.sh -javaHome /test12/applmgr/fs1/EBSapps/comn/clone/FMW/t2pjdk -al /test12/applmgr/fs1/EBSapps/comn/clone/FMW/FMW_Home.jar -smw /test12/applmgr/fs1/FMW_Home -ldl /test12/applmgr/fs1/inst/apps/test12_vmohswrly128/admin/log/clone/fmwT2PStage -invPtrLoc /etc/oraInst.loc -silent true -debug true
Script Executed in 298222 milliseconds, returning status 1
ERROR: Script failed, exit code 1

Solution:

1. set the environment variable T2P_JAVA_OPTIONS as given below

$export T2P_JAVA_OPTIONS="-Djava.io.tmpdir=/test12/applmgr/temp"

2. Make sure enough free space is available in the temp directory provided above

3.Rerun adpreclone.pl on appsTier



Tuesday 5 July 2016

ADOP Prepare Phase Fails with ORA-31011: XML parsing failed

While applying Patch on a test instance (R12.2.5), Prepare phase failed with error ORA-31011: XML parsing failed

Complete error message from adop logfile
==================================

[aptest12@localhost <RUN> ]$ adop phase=prepare

Enter the APPS password:
Enter the SYSTEM password:
Enter the WLSADMIN password:

Validating credentials.

Initializing.
    Run Edition context  : /test12/applmgr/fs2/inst/apps/test12_localhost/appl/admin/TEST12_localhost.xml
    Patch edition context: /test12/applmgr/fs1/inst/apps/test12_localhost/appl/admin/TEST12_localhost.xml
    Patch file system free space: 1130.34 GB

Validating system setup.
    [ERROR]     Failed to execute SQL statement:
select node_name from FND_OAM_CONTEXT_FILES
where NAME not in ('TEMPLATE','METADATA','config.txt') and
CTX_TYPE='A' and (status is null or upper(status) in ('S','F'))
and EXTRACTVALUE(XMLType(TEXT),'//file_edition_type') = 'run'
and EXTRACTVALUE(XMLType(TEXT),'//oa_service_group_status[@oa_var=''s_web_admin_status'']')='enabled'
and EXTRACTVALUE(XMLType(TEXT),'//oa_service_list/oa_service[@type=''admin_server'']/oa_service_status')='enabled'

    [ERROR]     Error Message:
    [ERROR]     ORA-31011: XML parsing failed
    [ERROR]     ORA-19202: Error occurred in XML processing
    [ERROR]     LPX-00229: input source is empty
    [ERROR]     ORA-06512: at "SYS.XMLTYPE", line 272
    [ERROR]     ORA-06512: at line 1 (DBD ERROR: error possibly near <*> indicator at char 208 in '    select node_name from FND_OAM_CONTEXT_FILES
    [ERROR]         where NAME not in ('TEMPLATE','METADATA','config.txt') and
    [ERROR]             CTX_TYPE='A' and (status is null or upper(status) in ('S','F'))
    [ERROR]             and EXTRACTVALUE(<*>XMLType(TEXT),'//file_edition_type') = 'run'
    [ERROR]             and EXTRACTVALUE(XMLType(TEXT),'//oa_service_group_status[@oa_var=''s_web_admin_status'']')='enabled'
    [ERROR]             and EXTRACTVALUE(XMLType(TEXT),'//oa_service_list/oa_service[@type=''admin_server'']/oa_service_status')='enabled'
    [ERROR]     ')
    [UNEXPECTED]Error occurred identifying Admin server node


[STATEMENT] Please run adopscanlog utility, using the command

"adopscanlog -latest=yes"

to get the list of the log files along with snippet of the error message corresponding to each log file.


Cause:

fnd_oam_context_files table has incorrect information  or the table is corrupted


Solution:


1. Login to sqlplus as applsys and take backup of fnd_oam_context_files table.


SQL> conn applsys
Enter password:
Connected.
SQL> create table fnd_oam_context_files_bkp as select * from fnd_oam_context_files;

Table created.

2. Truncate Table fnd_oam_context_files

SQL>  truncate table fnd_oam_context_files;

Table truncated.


3. Bring down MT Services

4. Run Autoconfig to populate fnd_oam_context_files  with right information.


Sunday 12 June 2016

Adpatch Worker Fails with "ORA-01031: insufficient privileges"

While Applying Patch on Customer environment, adpatch failed with error

Worker log has the below error.
=====================
GRANT select on DBA_USERS_WITH_DEFPWD to em_oam_monitor_role

AD Worker error:
The following ORACLE error:

ORA-01031: insufficient privileges


occurred while executing the SQL statement:

GRANT select on DBA_USERS_WITH_DEFPWD to em_oam_monitor_role

Error occurred in file

/test12/applmgr/1200/ad/12.0.0/patch/115/sql/ademusr.sql

with arguments '&systempwd &un_fnd &pw_fnd &un_apps &pw_apps'.

==============


Solution
=========

1.Login to oracle database server as oracle user

2. Connect to sqlplus as sysdba and run the failed sql statement
sqlplus / as sysdba

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> GRANT select on DBA_USERS_WITH_DEFPWD to em_oam_monitor_role
  2  ;

Grant succeeded.

SQL>


3. Use adctrl to skip the failed worker (Option 8) and continue with Patching

Weblogic Admin Server Startup Fails with "weblogic.security.SecurityInitializationException: Authentication denied: Boot identity not valid"

After Changing Weblogic Password in boot.properties, Admin Server Fails to Startup.
Below error is reported in Admin Server Logfile.

<May 10, 2017 10:10:18 AM CDT> <Critical> <Security> <BEA-090402> <Authentication denied: Boot identity not valid; The user name and/or password from the boot identity file (boot.properties) is not valid. The boot identity may have been changed since the boot identity file was created. Please edit and update the boot identity file with the proper values of username and password. The first time the updated boot identity file is used to start the server, these new values are encrypted.> 


Solution
=======

1.Shutdown Admin Server and Nodemanager
2.Go to $DOMAIN_HOME/servers/Admin Server>/data
3.Delete the boot.properties
4.Delete the whole ldap folder
5.Start the node-manager
6.Start Admin Server