Showing posts with label 11g. Show all posts

Sunday, August 06, 2017

What is Edition Based Redefinition (EBR) in Oracle?



Edition-Based Redefinition (EBR) is a feature introduced in Oracle Database 11g R2. It allows multiple versions of PL/SQL objects, views and synonyms to exist in a single schema. This feature is what makes online patching of database objects possible in R12.2.

Editionable and Non-editionable Objects

The following schema object types are editionable in the database:

SYNONYM
VIEW
SQL translation profile

All PL/SQL object types:

FUNCTION
LIBRARY
PACKAGE and PACKAGE BODY
PROCEDURE
TRIGGER
TYPE and TYPE BODY

If a schema object type is editionable in the database, then it can be editionable in schemas. All other schema object types are noneditionable in the database and in every schema, and objects of those types are always noneditioned. Tables are always noneditioned objects.


To view edition-related information in the database, use the data dictionary views below.

DBA_EDITIONS: lists the editions in the database, for example SELECT * FROM dba_editions;

DBA_OBJECTS_AE: shows all objects, across all editions.

Tuesday, July 25, 2017

Oracle Recovery Manager Features: BACKUP AS COPY

What is the advantage of the BACKUP AS COPY feature in RMAN?

By default, the RMAN (Recovery Manager) utility creates backups as backup sets made up of backup pieces. RMAN also supports another method, similar to a user-managed hot backup, through the BACKUP AS COPY command. With BACKUP AS COPY, RMAN creates the backup as image copies. Unlike a conventional hot backup, however, we do not need to put the database in BEGIN BACKUP/END BACKUP mode.

Syntax for Datafile Backup using Backup as copy:

RMAN> backup as copy datafile 1 format '/backup/data/datafile01.dbf' ;



Another advantage of this feature is that it can create an image copy of the control file.

 RMAN> backup as copy current controlfile format 'F:\CntrlFile_Bkp.ctl';  


How do we identify backups taken as image copies using RMAN?

Use the LIST COPY command. LIST BACKUP only lists backup sets, which are created by the default RMAN backup method.

Syntax:

RMAN> LIST COPY;

Tuesday, June 27, 2017

Oracle RMAN (Recovery Manager) Concepts

What is the difference between obsolete and expired Backup in RMAN?

Obsolete Backup: Backup pieces or backup sets which are no longer needed for recovery under the configured retention policy.
We can use the REPORT OBSOLETE command to list the obsolete backups and the DELETE OBSOLETE command to delete them.

Expired Backup: When the CROSSCHECK command checks whether backups recorded in the repository still exist on disk or tape and RMAN cannot locate them, it updates their records in the RMAN repository to EXPIRED status. We can then use the DELETE EXPIRED command to remove records of expired backups from the RMAN repository.
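
The distinction can be modelled in a few lines of Python. This is only an illustrative sketch, not how RMAN is implemented: the Backup record, the classify function, and the redundancy-style retention rule (keep the newest N backups that still exist) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    completed: int    # completion order; a larger value means more recent
    on_media: bool    # what a crosscheck would find on disk or tape


def classify(backups, redundancy=1):
    """Label each backup record as EXPIRED, OBSOLETE, or AVAILABLE.

    Hypothetical model: a record is EXPIRED when its file is missing from
    media. Among files that still exist, only the newest `redundancy`
    backups are needed; older ones are OBSOLETE.
    """
    present = sorted((b for b in backups if b.on_media),
                     key=lambda b: b.completed, reverse=True)
    needed = {b.name for b in present[:redundancy]}
    status = {}
    for b in backups:
        if not b.on_media:
            status[b.name] = "EXPIRED"
        elif b.name in needed:
            status[b.name] = "AVAILABLE"
        else:
            status[b.name] = "OBSOLETE"
    return status


catalog = [
    Backup("full_mon", completed=1, on_media=True),
    Backup("full_tue", completed=2, on_media=False),  # file removed at OS level
    Backup("full_wed", completed=3, on_media=True),
]
result = classify(catalog, redundancy=1)
print(result)
```

In this model full_tue is expired because its file is gone, while full_mon is obsolete because it still exists but the retention rule no longer needs it.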

Oracle DBA Concepts: Background Processes in Oracle ASM (Automatic Storage Management)

The following background processes are an integral part of Automatic Storage Management:

ARBn performs the actual rebalance data extent movements in an Automatic Storage Management instance. There can be many of these processes running at a time, named ARB0, ARB1, and so on.

ASMB runs in a database instance that is using an ASM disk group. ASMB communicates with the ASM instance, managing storage and providing statistics. ASMB can also run in the ASM instance. ASMB runs in ASM instances when the ASMCMD cp command runs or when the database instance first starts if the SPFILE is stored in ASM.

GMON maintains disk membership in ASM disk groups.

MARK marks ASM allocation units as stale following a missed write to an offline disk. This essentially tracks which extents require resync for offline disks.

RBAL runs in both database and ASM instances. In the database instance, it does a global open of ASM disks. In an ASM instance, it also coordinates rebalance activity for disk groups.

Sunday, June 25, 2017

How to change Rebalance Power in ASM Instances?


Whenever a disk is added to or dropped from a disk group, ASM automatically performs a rebalance operation. The power used by the rebalance defaults to the value of the asm_power_limit initialization parameter. We can check the status of the rebalance from the v$asm_operation view.

Increasing the value of ASM_POWER_LIMIT reduces the estimated time for completion of the rebalance operation.

ASM power limit can be increased using the below command

alter system set asm_power_limit =11;

To know more about values for this parameter, check this link:

http://www.appsdbadiaries.com/2017/06/asm-features-rebalancing.html


However, this does not change the power of a rebalance operation that is already running.

To increase the rebalancing power of an ongoing operation, use the below command

 alter diskgroup dg1 rebalance power 11;

We can also specify rebalance power at the time of adding or dropping disks.

alter diskgroup dg1 add disk d01 rebalance power 11;


When to use asm_power_limit 0?

Whenever several disks need to be added or dropped (as in migrations), setting the power limit to 0 is highly beneficial. It avoids waiting for each rebalance operation to complete before the next disk is added or dropped. In such cases, set asm_power_limit to 0, add/drop all the disks, and then set asm_power_limit back to a nonzero value.
The rebalance operation will be parallelized and complete faster.

Tuesday, June 06, 2017

Initialization Parameters Required for Oracle ASM (Automatic Storage Management)

The below parameters are required for ASM instance.

1. INSTANCE_TYPE 

INSTANCE_TYPE specifies whether the instance is a database instance or an Automatic Storage Management instance.

Values:

RDBMS

The instance is a database instance.

ASM

The instance is an Automatic Storage Management instance.


2. DB_UNIQUE_NAME 

DB_UNIQUE_NAME specifies a globally unique name for the database.

Default value
Database instances: the value of DB_NAME
Automatic Storage Management instances: +ASM

3. ASM_POWER_LIMIT 

ASM_POWER_LIMIT specifies the maximum power on an Automatic Storage Management instance for disk rebalancing. The higher the limit, the faster rebalancing will complete. Lower values will take longer, but consume fewer processing and I/O resources.

Default value: 1

Range of values: 0 to 11 (prior to 11gR2)

Range of values: 0 to 1024 (from 11gR2)



4. ASM_DISKGROUPS 

ASM_DISKGROUPS specifies a list of names of disk groups to be mounted by an Automatic Storage Management instance at instance startup or when an ALTER DISKGROUP ALL MOUNT statement is issued.

Automatic Storage Management (ASM) automatically adds a disk group to this parameter when the disk group is successfully created or mounted, and automatically removes a disk group from this parameter when the disk group is dropped or dismounted.

Issuing the ALTER DISKGROUP...ALL MOUNT or ALTER DISKGROUP...ALL DISMOUNT command does not affect the value of this parameter.

5. ASM_DISKSTRING

ASM_DISKSTRING specifies an operating system-dependent value used by Automatic Storage Management to limit the set of disks considered for discovery. When a new disk is added to a disk group, each Automatic Storage Management instance that has the disk group mounted must be able to discover the new disk using the value of ASM_DISKSTRING.

6. ASM_PREFERRED_READ_FAILURE_GROUPS 

ASM_PREFERRED_READ_FAILURE_GROUPS specifies the failure groups that contain preferred read disks. Preferred disks are instance specific. This parameter is only valid in ASM instances.

What are Nodeapps Services in Oracle RAC


Nodeapps are a standard set of Oracle application services that are started automatically for RAC.

Nodeapps include:

1) VIP.
2) Oracle Net listener.
3) Global Service Daemon.
4) Oracle Notification Service.

Nodeapps services run on each node of the cluster and are switched over to other nodes through the VIP during failover.


Useful commands to maintain nodeapps services:

srvctl stop nodeapps -n NODE1 
[ STOP NODEAPPS  on NODE 1 ]
srvctl stop nodeapps -n NODE2            
[ STOP NODEAPPS  on NODE 2 ]

srvctl start nodeapps -n NODE1              
[ START NODEAPPS on NODE1  ]
srvctl start nodeapps -n NODE2             
[  START NODEAPPS ON NODE2 ]

srvctl status nodeapps                      
[Check the status of services on all nodes]

Friday, May 05, 2017

Oracle Database Administrator Interview Questions on RAC

Oracle RAC (Voting Disk) FAQs

1. What is Voting Disk?

Voting Disk is a file that maintains node membership details.
All members in the Cluster read and write Heartbeat Information in Voting Disk.

2. What Information is stored in Voting Disk?

Voting disks contain static and dynamic data.
Static data: information about the nodes in the cluster
Dynamic data: disk heartbeat logging
The voting disk maintains important details about cluster node membership, such as
- which node is part of the cluster,
- which node is joining the cluster, and
- which node is leaving the cluster.

3. Why do we need Voting Disk?

CSSD processes (Cluster Services Synchronization Daemon) monitor the health of RAC nodes using two distinct heartbeats: the network heartbeat and the disk heartbeat. Healthy nodes continuously exchange network and disk heartbeats. A break in the heartbeat indicates a possible error scenario. A few different scenarios are possible with missing heartbeats:
1. Network heartbeat is successful, but disk heartbeat is missed.
2. Disk heartbeat is successful, but network heartbeat is missed.
3. Both heartbeats failed.
In addition, with numerous nodes, there are other possible scenarios. A few of them:
1. The nodes have split into N sets of nodes, each communicating within its own set but not with members of the other sets.
2. Just one node is unhealthy.
Nodes with quorum maintain active membership of the cluster, and the other node(s) are fenced/rebooted.

4. Why do we have an odd number of voting disks?

An odd number of voting disks is required to prevent Split Brain Syndrome.
A node must be able to access more than half of the voting disks in order to decide which node can be evicted in case of failure.
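
The "more than half" rule is easy to check with a little arithmetic. The sketch below is only an illustration of that rule; the function names has_quorum and tolerated_failures are made up for this example and are not part of any Oracle tool.

```python
def has_quorum(accessible: int, total: int) -> bool:
    """A node keeps quorum only if it can access more than half of the voting disks."""
    return accessible > total // 2


def tolerated_failures(total: int) -> int:
    """Largest number of voting disks that can fail while a strict majority remains."""
    return (total - 1) // 2


for total in (1, 2, 3, 4, 5):
    print(total, "voting disk(s) tolerate", tolerated_failures(total), "failure(s)")
```

With 3 voting disks one failure is tolerated, and with 4 voting disks still only one failure is tolerated, so an even number adds a disk without adding any protection.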

5. How to backup Voting Disk?

Prior to Oracle 11g R2, the voting disk is backed up using the dd command.

From 11g R2, there is no need to back up the voting disk manually.
It is backed up automatically along with the OCR whenever there is a configuration change.

In 11g R2, restoring a voting disk from a manually copied backup file may prevent cluster services from starting up.

Oracle Database Administrator Interview Questions on RMAN

RMAN INTERVIEW QUESTIONS
========================

1. What is the difference between recovery catalog and nocatalog backup in RMAN?

NoCatalog Backup:

By default, RMAN connects to the target database in nocatalog mode.
In nocatalog mode, backup information and RMAN metadata are stored in the target database control file.

Catalog Backup:

A recovery catalog is a schema created in a separate database that contains metadata obtained from the target control file.
In the recovery catalog we can store RMAN scripts.

We can store metadata about multiple incarnations of a single target database in the catalog.
The recovery catalog is central and can hold information about many databases.
If the control file is lost and must be restored from backup, the backup configuration information is still available from the catalog while the database is not mounted.

2. What are the differences between crosscheck and validate commands?

Validate command is to examine a backup set and report whether it can be restored. RMAN scans all of the backup pieces in the specified backup sets and looks at the checksum to verify that the contents are intact so that backup can be successfully restored if necessary.

Crosscheck command is to verify the status of backups and copies recorded in the RMAN repository against media such as disk or tape. The crosscheck command only processes files created on the same device type as the channel running crosscheck.
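
The difference between the two checks can be sketched as follows. This is a toy model and not RMAN's implementation: the repository and media dictionaries are invented for the example, and CRC32 from Python's zlib module merely stands in for whatever checksum RMAN actually uses.

```python
import zlib


def crosscheck(repository, media):
    """Mark each recorded piece AVAILABLE or EXPIRED by checking only that it exists."""
    return {name: ("AVAILABLE" if name in media else "EXPIRED")
            for name in repository}


def validate(repository, media):
    """Read each piece that exists and compare its checksum with the recorded one."""
    report = {}
    for name, recorded_crc in repository.items():
        if name not in media:
            report[name] = "MISSING"
        elif zlib.crc32(media[name]) == recorded_crc:
            report[name] = "OK"
        else:
            report[name] = "CORRUPT"
    return report


# Repository: piece name -> checksum recorded when the backup was taken.
repository = {
    "piece_1": zlib.crc32(b"datafile blocks A"),
    "piece_2": zlib.crc32(b"datafile blocks B"),
    "piece_3": zlib.crc32(b"datafile blocks C"),
}
# Media: piece_2 was damaged after the backup, piece_3 was deleted.
media = {
    "piece_1": b"datafile blocks A",
    "piece_2": b"datafile blocks X",
}

crosscheck_report = crosscheck(repository, media)
validate_report = validate(repository, media)
print(crosscheck_report)
print(validate_report)
```

Note that the damaged piece_2 passes the crosscheck, because it is still present on media, but fails validation.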

3. What is obsolete backup & expired backup?

A status of “expired” means that the backup piece or backup set is not found in the backup destination.

A status of “obsolete” means the backup piece is still available, but it is no longer needed. It is no longer needed because the configured retention policy (a recovery window in days, or a redundancy count of backups) no longer requires it.

4. What is the difference between hot backup & RMAN backup?

We need to put the database in backup mode for a hot backup. An RMAN backup does not need the database to be in backup mode.

5. Which views hold the information required for RMAN LIST and REPORT commands?
V$BACKUP_FILES and recovery catalog views, e.g. RC_DATAFILE_COPY or RC_ARCHIVED_LOG.

The following views contain RMAN catalog information:

RC_DATABASE_INCARNATION
RC_BACKUP_COPY_DETAILS
RC_BACKUP_CORRUPTION
RC_BACKUP_DATAFILE_SUMMARY




Friday, December 30, 2016

What is OCR (Oracle Cluster Registry)?

Oracle Cluster Registry

OCR is the central repository for the Oracle cluster.
OCR holds real-time information about ports, node failures, node reconfiguration, database status, listener status, instance status, and network status.


Default location of the ocr.loc file on Linux: /etc/oracle/ocr.loc

Oracle Clusterware reads ocr.loc to locate the OCR, and then uses the registry information to determine the application resources that need to be started up on each cluster node.

The below processes update OCR with real-time information.

1. CRSd updates OCR with information about node failure
2. CSSd updates OCR when a node is added or deleted
3. NetCA, DBCA, and SRVCTL update OCR with service-related information

OCR also has information about the active software version.

The software version can also be found using the below command
$ crsctl query crs activeversion



OCR Backup and Restore



OCR is a binary file and cannot be edited manually. By default, OCR is backed up automatically every 4 hours (from instance startup time).
OCR backups are cached across all nodes of a cluster. However, only the master RAC node can write to OCR.
OCR backups are overwritten automatically. Clusterware maintains the last 3 backups of OCR.
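
The overwrite behaviour described above (a backup every 4 hours, only the last 3 kept) can be pictured with a small simulation. This is a simplified model of the description in this post, not of Clusterware's actual retention logic; the function name and constants are chosen just for the example.

```python
from collections import deque

BACKUP_INTERVAL_HOURS = 4   # from the text: automatic backup every 4 hours
BACKUPS_RETAINED = 3        # from the text: the last 3 backups are kept


def simulate_backups(hours_running):
    """Return the backup times (hours since startup) that are still retained."""
    retained = deque(maxlen=BACKUPS_RETAINED)  # the oldest entry drops off automatically
    for hour in range(BACKUP_INTERVAL_HOURS, hours_running + 1, BACKUP_INTERVAL_HOURS):
        retained.append(hour)
    return list(retained)


print(simulate_backups(24))
```

After 24 hours six backups have been taken, but only those from hours 16, 20 and 24 remain, which is why a manual backup is worth taking after any important clusterware change.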

Backup location can be defined using the below command.

$ocrconfig -backuploc <backup location>

It is recommended to use shared location for storing OCR backups.

To view the current OCR backups available, use the below commands

$ocrconfig -showbackup auto

$ocrconfig -showbackup manual


After any clusterware-related changes, an OCR backup can be taken manually.

$ocrconfig -manualbackup


If OCR file is Corrupted or lost, CRS Instance may crash.

To restore OCR from backup, follow the procedure below.

1. Identify the latest OCR backup using ocrconfig

$ocrconfig -showbackup

2. Shut down CRS services on all RAC nodes

$crsctl stop crs

3. Start CRS on one node in exclusive mode

$crsctl start crs -excl

Stop crsd, if it is running

$ crsctl stop resource ora.crsd -init

4. Restore the OCR

$ocrconfig -restore <backuplocation/backupfilename>

5. Verify the integrity of OCR

$ocrcheck

6. Shut down CRS services on the node that was started in exclusive mode

$crsctl stop crs

7. Start CRS on all RAC nodes

$crsctl start crs

8. Verify the restored OCR

$cluvfy comp ocr -n all -verbose

Monday, May 23, 2016

"lsnrctl: symbol lookup error: lsnrctl: undefined symbol: nlgh" while starting Oracle Database listener

After applying a PSU, the database listener failed to start with the below error

 $ lsnrctl start testdb
lsnrctl: symbol lookup error: lsnrctl: undefined symbol: nlgh


Solution
==========
1. Relink network executables using below command

$  make -kf ins_net_server.mk install
/bin/chmod 755 /testdb/oracle/product/11204/bin/
 - Linking tnslsnr
rm -f tnslsnr
gcc -o tnslsnr -m64 -z noexecstack -L/testdb/oracle/product/11204/network/lib/ -L/testdb/oracle/product/11204/lib/ -L/testdb/oracle/product/11204/lib/stubs/  /testdb/oracle/product/11204/network/lib/s0nsgl.o /testdb/oracle/product/11204/network/lib/snsglp.o -lclntsh  `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnnz11 -lzt11 -lztkg11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11 -lmm -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11   -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11 -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11   `cat /testdb/oracle/product/11204/lib/sysliblist` -Wl,-rpath,/testdb/oracle/product/11204/lib -lm    `cat /testdb/oracle/product/11204/lib/sysliblist` -ldl -lm   -L/testdb/oracle/product/11204/lib -lons -lnl11  -ln11 -lnlsnr11 -lnlsnrc11 -lnsgr11 -lnms -lncm11 -lnmsp -lpeer -lnro11 -ln11   -lnl11 -lnlsnr11 -lnlsnrc11 -lnms -lncm11 -lnmsp -ln11     -lm
mv -f /testdb/oracle/product/11204/bin/tnslsnr /testdb/oracle/product/11204/bin/tnslsnr0
mv tnslsnr /testdb/oracle/product/11204/bin/tnslsnr
/bin/chmod 751 /testdb/oracle/product/11204/bin/tnslsnr
 - Linking lsnrctl
rm -f lsnrctl
gcc -o lsnrctl -m64 -z noexecstack -L/testdb/oracle/product/11204/network/lib/ -L/testdb/oracle/product/11204/lib/ -L/testdb/oracle/product/11204/lib/stubs/  /testdb/oracle/product/11204/network/lib/s0nsglc.o -lclntsh  `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnnz11 -lzt11 -lztkg11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11 -lmm -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /testdb/oracle/product/11204/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11   -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lclient11 -lnnetd11  -lvsn11 -lcommon11 -lgeneric11 -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11   `cat /testdb/oracle/product/11204/lib/sysliblist` -Wl,-rpath,/testdb/oracle/product/11204/lib -lm    `cat /testdb/oracle/product/11204/lib/sysliblist` -ldl -lm   -L/testdb/oracle/product/11204/lib -lnl11  -ln11 -lnlsnr11 -lnlsnrc11 -lnsgr11 -lnms -lncm11 -lnmsp -lpeer -lnro11 -ln11   -lnl11 -lnlsnr11 -lnlsnrc11 -lnms -lncm11 -lnmsp -ln11     -lm
mv -f /testdb/oracle/product/11204/bin/lsnrctl /testdb/oracle/product/11204/bin/lsnrctl0
mv lsnrctl /testdb/oracle/product/11204/bin/lsnrctl
/bin/chmod 751 /testdb/oracle/product/11204/bin/lsnrctl


2. Restart Database Listener

$ lsnrctl start testdb

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 16-MAY-2016 02:00:20

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Starting /testdb/oracle/product/11204/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.4.0 - Production
System parameter file is /testdb/oracle/product/11204/network/admin/testdb/listener.ora
Log messages written to /testdb/log/network/testdb.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dblocalhost)(PORT=1523)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dblocalhost)(PORT=1523)))
STATUS of the LISTENER
------------------------
Alias                     testdb
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                16-MAY-2016 02:00:20
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /testdb/oracle/product/11204/network/admin/testdb/listener.ora
Listener Log File         /testdb/log/network/testdb.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dblocalhost)(PORT=1523)))
Services Summary...
Service "testdb" has 1 instance(s).
  Instance "testdb", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

Thursday, March 24, 2016

Oracle RMAN Incremental Backup Algorithm and Change Tracking File

During an incremental backup, RMAN reads the SCN of each data block in the input file and compares it to the checkpoint SCN
of the parent incremental backup. If the SCN in the input data block is greater than or equal to the checkpoint SCN of the
parent, then RMAN copies the block.
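
The comparison described above can be written down as a short sketch. It only illustrates the rule; the dictionary of block SCNs and the function name are hypothetical and do not reflect RMAN internals.

```python
def blocks_to_back_up(block_scns, parent_checkpoint_scn):
    """Return the block numbers whose SCN is >= the parent backup's checkpoint SCN."""
    return [block_no
            for block_no, scn in sorted(block_scns.items())
            if scn >= parent_checkpoint_scn]


# Hypothetical datafile: block number -> SCN of the last change to that block.
datafile = {1: 950, 2: 1000, 3: 1200, 4: 400, 5: 1001}

selected = blocks_to_back_up(datafile, parent_checkpoint_scn=1000)
print(selected)
```

Every block still has to be read to make the comparison, which is the cost that block change tracking is meant to avoid.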


The performance of incremental backups can be improved by enabling block change tracking. When change tracking is enabled, all changes to data blocks are recorded in the change tracking file. RMAN can then read the change tracking file to identify the changed blocks, instead of scanning every block of every input file.
One change tracking file is created for the entire database.
By default, the change tracking file is created as an Oracle managed file in DB_CREATE_FILE_DEST. We can also specify the name and location of the block change tracking file ourselves.
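
To see why this helps, here is a toy model that counts how many blocks a backup has to read with and without tracking. Everything in it is an assumption made for illustration: the Datafile class, its tracked set (standing in for the change tracking file), and the per-block SCN list are not Oracle structures.

```python
class Datafile:
    """Toy datafile that keeps a per-block SCN and a set of changed block numbers."""

    def __init__(self, block_count):
        self.scn = [0] * block_count   # SCN of the last change to each block
        self.tracked = set()           # stands in for the change tracking file
        self.current_scn = 0

    def write(self, block_no):
        """Change a block: bump the SCN and record the block as changed."""
        self.current_scn += 1
        self.scn[block_no] = self.current_scn
        self.tracked.add(block_no)

    def incremental(self, parent_scn, use_tracking):
        """Return (blocks copied, number of blocks read) for an incremental backup."""
        candidates = sorted(self.tracked) if use_tracking else range(len(self.scn))
        copied, blocks_read = [], 0
        for block_no in candidates:
            blocks_read += 1
            if self.scn[block_no] >= parent_scn:
                copied.append(block_no)
        return copied, blocks_read


df = Datafile(block_count=1000)
for block_no in (7, 42, 7):            # three changes touching two distinct blocks
    df.write(block_no)

full_scan = df.incremental(parent_scn=1, use_tracking=False)
with_tracking = df.incremental(parent_scn=1, use_tracking=True)
print(full_scan)
print(with_tracking)
```

Both runs copy the same two blocks, but the full scan reads 1000 blocks while the tracked run reads only 2.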


SQL command to enable block change tracking

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;
This creates change tracking file in "db_create_file_dest" directory


SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING  USING FILE '/test1/data/rman_change_track.f' REUSE;
This creates the change tracking file in the location specified.


SQL command to disable block change tracking

SQL> ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
By default, block change tracking is disabled.
If the change tracking feature is disabled, any existing change tracking file is deleted.


SQL query to find change tracking file Details

select filename, status from v$block_change_tracking;


How to Relocate the change tracking file


1. Shut down the database

SHUTDOWN IMMEDIATE

2. Using OS commands, move the change tracking file to a new location.

3. Mount the database and rename the change tracking file in the database so that it points to the new location.

STARTUP MOUNT

SQL query to rename change tracking file

ALTER DATABASE RENAME FILE '/test1/data/rman_change_track.f' TO '/newtest1/data1/change_trk.f';

4. Open the database:

ALTER DATABASE OPEN;


If we cannot afford downtime for database shutdown, change tracking file can be relocated as below


ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE 'new-location';

The contents of the change tracking file are lost with this method, so it is not the preferred method.
Until the next level 0 incremental backup is taken, RMAN has to scan the entire file.

Advantages of RMAN Incremental Backups

What is RMAN Incremental Backup?

Incremental backups back up only the datafile blocks that have changed since a previous backup.
Incremental backups can be taken of the whole database, individual tablespaces, or datafiles.


Benefits of RMAN Incremental Backup:

1. RMAN incremental backups are used to periodically roll forward an image copy of the database

2. Reduced amount of time needed for backups

3. Less network bandwidth is required

4. To get adequate backup performance when the aggregate tape bandwidth available for tape write I/Os is much less than the aggregate disk bandwidth for disk read I/Os

5. To recover changes to objects created with the NOLOGGING option.

6. To reduce backup sizes for NOARCHIVELOG databases. Instead of making a whole database backup every time, you can make incremental backups.

Note:

If the database is in ARCHIVELOG mode, incremental backups can be taken while the database is open.
If the database is in NOARCHIVELOG mode, then we can only make incremental backups after a consistent shutdown.


See the post "Oracle RMAN Incremental Backup Algorithm and Change Tracking File" to know more about how incremental backup is implemented by RMAN and the importance of the block change tracking feature.

Thursday, March 17, 2016

How to Perform Clusterware Compatibility Testing in Oracle RAC

+ Compatibility testing can be done using the Oracle Certification Environment (OCE) kit

Installing the Oracle Certification Environment Software for Oracle RAC

The OCE Certification Kit required to certify the system for Oracle RAC 11g Release 1 (11.1) is available for download only. The Single Instance certification tests should be completed prior to installing the OCE kit for Oracle RAC. Refer to the previous section if necessary. Once Single Instance testing has successfully completed, the single instance OCE installations must be archived to allow OCE installations for Oracle RAC to succeed. The OCE kit for Oracle RAC should be installed separately on each node of the cluster. If the ORACLE_HOME is located on a shared disk, multiple installations of OCE will not be possible. In that case, it will not be possible to run OCE tests simultaneously, and the time required to complete certification will be greatly increased. To install the OCE Kit:
1. Download the OCE for Oracle RAC 11g Release 1 (11.1) archive to a suitable location, such as /tmp/oce. The OCE archives are either CPIO archives or compressed CPIO archives.
If compressed, extract as follows:
gunzip -c OCE_ARCHIVE | cpio -idmv
If not compressed, extract as follows:
cpio -idmv < OCE_ARCHIVE
Where OCE_ARCHIVE is the name of the archive.
2. Enter the following command to run the OCE Installer:
$ archive_location/oce_install.sh
The Environment variable screen is displayed.
3. Enter values for each of the environment variables as described on screen.
Note: You must press the Enter key once to enter the new variable value, and then press the Enter key again to move on to the next variable.
4. Type Done when finished and press the Enter key.
The installation progress screen is displayed. When all stages are complete, the installer will exit.
5. Check the $ORACLE_HOME/OCEinstallRAC.log file, and verify that there are no errors.
6. Download the Binaries Package for OCE for Oracle RAC 11g Release 1 (11.1) which matches your platform, and extract the archive as explained in Step 1.
7. Enter the following command to run the OCE binaries installation script:
$ /tmp/oce/oce_exes_install.sh
8. Check the OCE Kit installation log file, $ORACLE_HOME/OCE/install_log.txt, and verify that the installation was successful.
The kit is installed in the $ORACLE_HOME/oce directory.

Preparing the System for Multi-Node High Availability Services Testing
The Oracle RAC High Availability Services test suite must use shared storage for data files. Depending on the type of shared storage utilized, some preliminary setup may be required. Before running the Oracle RAC High Availability Services test suite, complete the following:
  • If you are using raw devices or logical volumes for shared storage perform the following steps:

Note:
In this example, the OCE user is oracle, which is a member of the dba group; there are 4 nodes in the cluster; and the OCE logical volumes are located in /dev/ocevg/ directory.

    1. Set up the devices or logical volumes required by the tests.
    2. Ensure raw devices are accessible and writable across all nodes by the OCE user:
      # chown -R oracle:dba /dev/ocevg
      # chmod -R og+w /dev/ocevg
    3. Export $ORACLE_HOME/oce/work, $ORACLE_HOME/dbs, and $ORACLE_HOME/network/admin from node 1 to all other nodes in the cluster.
On node 1:
      # exportfs -i -o rw <node2>:$ORACLE_HOME/oce/work \
        <node3>:$ORACLE_HOME/oce/work \
        <node4>:$ORACLE_HOME/oce/work
      # exportfs -i -o rw <node2>:$ORACLE_HOME/dbs \
        <node3>:$ORACLE_HOME/dbs \
        <node4>:$ORACLE_HOME/dbs
      # exportfs -i -o rw <node2>:$ORACLE_HOME/network/admin \
        <node3>:$ORACLE_HOME/network/admin \
        <node4>:$ORACLE_HOME/network/admin
    4. $ORACLE_HOME/oce/work, $ORACLE_HOME/network/admin, and $ORACLE_HOME/dbs must be mounted on all secondary nodes from the primary (exported) node.
On all nodes except node 1:
      # mkdir -p $ORACLE_HOME/oce/work
      # chown oracle:dba $ORACLE_HOME/oce/work
      # mount <node1>:$ORACLE_HOME/oce/work $ORACLE_HOME/oce/work
      # mount <node1>:$ORACLE_HOME/dbs $ORACLE_HOME/dbs
      # mount <node1>:$ORACLE_HOME/network/admin \
        $ORACLE_HOME/network/admin
  • If you are using OCFS or NAS or a vendor clustered file system (CFS), and ORACLE_HOME directory is not located on shared partition, then perform the following steps:

Note:
If NAS, ensure that the appropriate mount options are employed when mounting the NAS partition. Oracle requires specific mount options. Consult your NAS Filer documentation for further details.

    1. Symbolically link $ORACLE_HOME/dbs to the OCFS/CFS/NAS partition on all nodes (in this example, the OCFS/CFS/NAS partition is at /sharedfs).
On node 1:
      mkdir /sharedfs/dbs
      chown oracle:dba /sharedfs
On all nodes:
      mv $ORACLE_HOME/dbs $ORACLE_HOME/dbs.BAK
      ln -s /sharedfs/dbs $ORACLE_HOME/dbs
    2. Export $ORACLE_HOME/oce/work and $ORACLE_HOME/network/admin from the primary node.
      # exportfs -i -o rw <node2>:$ORACLE_HOME/oce/work \
        <node3>:$ORACLE_HOME/oce/work \
        <node4>:$ORACLE_HOME/oce/work
      # exportfs -i -o rw <node2>:$ORACLE_HOME/network/admin \
        <node3>:$ORACLE_HOME/network/admin \
        <node4>:$ORACLE_HOME/network/admin
    3. $ORACLE_HOME/oce/work and $ORACLE_HOME/network/admin must be mounted on all secondary nodes from the primary (exported) node. Default mount options will suffice.
On all nodes except node 1:
      # mkdir -p $ORACLE_HOME/oce/work
      # chown oracle:dba $ORACLE_HOME/oce/work
      # mount <node1>:$ORACLE_HOME/oce/work $ORACLE_HOME/oce/work
      # mount <node1>:$ORACLE_HOME/dbs $ORACLE_HOME/dbs
      # mount <node1>:$ORACLE_HOME/network/admin \
        $ORACLE_HOME/network/admin
  • If you are using OCFS or CFS accessing a shared Oracle home directory, no setup is required.
  • Ensure that no databases are running.

Starting Test Manager
To start Test Manager:
  1. Ensure that the DISPLAY environment variable is set appropriately for your system. To verify that it is, try starting up xclock. If you do not see the clock, or you receive errors, DISPLAY is not set appropriately. You must correct any errors before proceeding.
  2. Enter the following command to launch OCE Test Manager:
     $ORACLE_HOME/oce/bin/startTM.sh > /tmp/OCETM.log 2>&1
The OCE Main Menu and OCE Test Manager windows appear.

Running a Test for the First Time
If this is your first time running certification tests, you must perform the following steps:
  1. Start Test Manager as described in Starting Test Manager.
  2. From the OCE - Main Menu window, double click Utilities.
  3. Run the bmchk test by selecting it and clicking Execute.
  4. When the test completes, click Results in the Test Manager window to check the outcome. If the test fails, you must analyze the output ($INST_HOME/work/bmchk) and resolve any issues. Do not proceed with testing until bmchk executes successfully.
  5. Run sdbck (the Seed Database Verification utility) test by selecting it and clicking Execute.
  6. When the test completes, click Results in the Test Manager window to check the outcome. If the test fails, you must analyze the output ($INST_HOME/work/sdbck) and resolve any issues. Do not proceed with testing until sdbck executes successfully.
  7. Run cssck (the CSS Daemon Verification utility) test by selecting it and clicking Execute.
  8. When the test completes, click Results in the Test Manager window to check the outcome. If the test fails, you must analyze the output ($INST_HOME/work/cssck) and resolve any issues. Do not proceed with testing until cssck executes successfully.

Running the OCE Test Suites

The OCE release consists of a set of test suites that you run from Test Manager. Each test suite consists of one or more individual tests. To complete the certification, run each of the Test Suites in the kit for the product for which you are certifying your system.
1.      From the OCE - Main Menu screen, double-click Complete test suites.
2.      From the screen that appears, select the test suite you want to run and click Execute to run it.
The test suite runs.
Test Manager creates two entries for the test suite in the Test Manager window, one in the Current Tests field and another in the History field:
o    The entry in the Current Tests field is displayed only for the duration of a test. It displays the time at which you requested the test and, once the test starts, the time at which it started. Test Manager might display some tests with a status of Waiting until resources become available on the system.
o    The entry in the History field displays the time you requested it.
When a test finishes, Test Manager deletes its entry in the Current Tests field and adds another entry to the History field showing when the test finished.






Below are the test plans for Oracle Clusterware Compatibility (Destructive) Testing
(Category: ORACLE HIGH AVAILABILITY FEATURES)

Each test plan lists: Clusterware Test Category [Test Code], Action / Target, Detailed Test Execution, Expected Test Outcome, Actual Test Outcome.

[D] Oracle HA Features


[HW-CW-09]

Action: Run multiple cluvfy operations during Oracle Clusterware and RAC install
Target: All RAC hosts

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB
Preconditions:
·          Type `cluvfy` to see all available command syntax and options

Steps:
1- Run the cluvfy precondition check
2- Do the next install step
3- Run the cluvfy post-condition check
(use `cluvfy comp software -n node_list` to check the file permissions)
There is no need to collect CRS/RDBMS logs for this test. Submit the cluvfy output.
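The pre-check / install / post-check pattern in these steps might be scripted as follows. This is only a sketch under stated assumptions: it uses the `crsinst` stage as the example install step (the test plan does not say which stage to verify), `RUN` is a hypothetical hook that defaults to `echo` so the commands are printed rather than executed, and the node list is a placeholder.

```shell
# RUN is a hypothetical hook: "echo" (the default) prints each command;
# export RUN="" to execute the commands for real.
RUN=${RUN-echo}

# Precondition check before the install step (crsinst stage assumed).
pre_install_check() {
    $RUN cluvfy stage -pre crsinst -n "$1"
}

# Post-condition check, followed by the file permission check.
post_install_check() {
    $RUN cluvfy stage -post crsinst -n "$1"
    $RUN cluvfy comp software -n "$1"
}
```

For example, `pre_install_check node1,node2 > cluvfy_pre.out` captures the output to submit with the test results.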


Vendor Clusterware:
- same as RAC

RAC:
-           Correct cluster verification checks given the state of the cluster hardware and software

Please provide the CVU-related logs under $CRS_HOME/cv/log


[HW-CW-10]

Action: Run concurrent crsctl start/stop crs commands to stop or start Oracle Clusterware in planned mode
Target: All RAC hosts

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB


Preconditions:
·          Initiate all Workloads
·          Identify both CSS and CRS master nodes
·          Type `crsctl` as root to see all available command syntax and options

Steps:
1- As root user, run the `crsctl stop crs` command concurrently on more than one RAC host to stop the resident Oracle Clusterware stack
2- Wait until the target Oracle Clusterware stack is fully stopped (verify with the `ps` command)
3- As root user, run the `crsctl start crs -wait` command concurrently on more than one RAC host to start the resident Oracle Clusterware stack
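One way to issue the same command on several hosts at once is to start it as a background job per host and then wait for all of them. The sketch below assumes passwordless root `ssh` to each RAC host; `REMOTE_SHELL` and `run_concurrently` are illustrative names, not Oracle tooling.

```shell
# REMOTE_SHELL is an assumption: any command invoked as "<host> <command>".
REMOTE_SHELL=${REMOTE_SHELL:-ssh}

# Usage: run_concurrently "<command>" host1 host2 ...
# Starts the command on every host in the background, then waits for all
# of them. Returns non-zero if any host reported a failure.
run_concurrently() {
    cmd=$1
    shift
    pids=""
    for host in "$@"; do
        $REMOTE_SHELL "$host" "$cmd" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

For example, `run_concurrently "crsctl stop crs" node1 node2`, followed after the stack has stopped by `run_concurrently "crsctl start crs -wait" node1 node2`.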


Vendor Clusterware:
- N/A

RAC:
Stop:  All Oracle Clusterware daemons stop without leaving open ports or zombie processes
Start:  All Oracle Clusterware daemons start without error messages in stdout or any of the CRS, CSS or EVM traces
Start:  All registered HA resource states match the “target” states, as per “crsctl stat res -t”

For 12cR1, collect the output of “crsctl stat res -t” in a 60s loop from the beginning to the end of the run.  Attach the output for auditing.

[HW-CW-11]

Action: Run other concurrent crsctl commands, such as crsctl check crs
Target: All RAC hosts

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB
Preconditions:
·          Initiate all Workloads
·          Identify both CSS and CRS master nodes
·          Type `crsctl` as root to see all available command syntax and options

Steps:
1-        As root user, run `crsctl check crs` concurrently on all nodes
2-        As root user, run `crsctl check cluster -all` concurrently on all nodes


Vendor Clusterware:
- same as RAC

RAC:
-           Both `crsctl check crs` and `crsctl check cluster -all` commands produce the appropriate, useful output, without any error messages
-           Collect output for step 1 and step 2

[HW-CW-12]

Votedisk and OCR operation

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB
Preconditions:
·          Make sure the voting disk is on an ASM disk group
·          Make sure ASM OCR files are used
·          Make sure at least one normal-redundancy ASM disk group with three failgroups is created and its “compatible.asm” attribute is set to “11.2”

Steps:
1-        Make sure the CRS stack is running on all nodes.

2-        Run “crsctl query css votedisk” to check the configured voting files (VFs);
3-        Run “crsctl replace votedisk +{ASM_DG_NAME}” (as the CRS user or root user);
4-        Run “crsctl query css votedisk” to get the new VF list;
5-        Run “ocrconfig -add +{ASM_DG_NAME}” as root user;
6-        Run “ocrcheck” to verify the OCR files;
7-        Restart the CRS stack and then verify the VF/OCR after it comes back;

Variants:
1. Add up to 5 OCR files and restart CRS stack;
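Steps 2 to 6 above can be collected into one function so that the order is easy to review before running anything. This is a sketch only: `relocate_vf_and_ocr` and the `RUN` hook are illustrative names, the disk group name is supplied by the caller, and by default the commands are printed rather than executed.

```shell
# RUN is a hypothetical hook: "echo" (the default) prints each command;
# export RUN="" to execute the commands for real (as root).
RUN=${RUN-echo}

# Usage: relocate_vf_and_ocr <ASM disk group name without the leading "+">
relocate_vf_and_ocr() {
    dg=$1
    $RUN crsctl query css votedisk        # step 2: current voting files
    $RUN crsctl replace votedisk "+$dg"   # step 3: move voting files to the disk group
    $RUN crsctl query css votedisk        # step 4: new voting file list
    $RUN ocrconfig -add "+$dg"            # step 5: add an OCR location
    $RUN ocrcheck                         # step 6: verify the OCR files
}
```

Running `relocate_vf_and_ocr DATA` with the default settings prints the five commands in order. Step 7 (restarting the CRS stack) is left out on purpose because it has to be done on each node.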


RAC:
-           In 12cR1, up to 5 OCRs are supported;





[HW-CW-13]
crsctl commands to manage the Oracle Clusterware stack

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB
Preconditions:
·          CRS stack is up and running on all nodes.

Steps:
1-        Run ‘crsctl check cluster -all’ to get the stack status on all cluster nodes. Make sure the stack status of all cluster nodes is correct;
2-        Run ‘crsctl stop cluster -all’ to stop all CRS resources (CSSD/CRSD/EVMD) along with the application resources;
3-        Run ‘crsctl status cluster -all’ to make sure the CRS resources are OFFLINE;
4-        Run ‘crsctl start cluster -all’ to bring back the whole cluster stack

RAC:
-           After running “crsctl stop cluster -all”, make sure all ocssd/evmd/crsd processes are stopped on all cluster nodes by checking “ps -ef”.
For 12cR1, collect the output of “crsctl stat res -t” in a 60s loop from the beginning to the end of the run.  Attach the output for auditing.
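The process check after stopping the cluster can be expressed as a small filter over `ps -ef` output. The sketch below assumes the daemons appear with their usual executable names (`ocssd.bin`, `evmd.bin`, `crsd.bin`); `stack_is_down` is an illustrative name.

```shell
# Reads "ps -ef" output on stdin.
# Succeeds only if no ocssd/evmd/crsd daemon process is listed.
stack_is_down() {
    if grep -Eq '(ocssd|evmd|crsd)\.bin'; then
        return 1
    fi
    return 0
}
```

On each node, `ps -ef | stack_is_down && echo "clusterware daemons stopped"` gives a quick yes/no answer.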


[HW-CW-14]

OCR stored in an ASM disk group; kill an ASM fatal process

Configuration:
GNS:
GNS with DHCP (1)
GNS without DHCP (2)
Without GNS (3)
Preferred: option 1; if not applicable, option 2; if still not applicable, option 3

ASM:
Flex ASM (1)
Standard ASM (2)
Preferred: option 1; if not applicable, option 2

DB:
CDB
Preconditions:
·          Initiate Workloads

Steps:
·          Make sure only ASM OCR files are used, by running “ocrcheck -config”;

·          Kill the ASM pmon process on the CRSD PE master node;

Variants:
   Repeat the same test on a non-OCR master node.
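To find the process to kill, the ASM pmon PID can be picked out of `ps -ef` output. The sketch below assumes the standard background process name `asm_pmon_+ASM<n>` and the usual `ps -ef` layout with the PID in the second column; `asm_pmon_pid` is an illustrative name.

```shell
# Reads "ps -ef" output on stdin and prints the PID of the ASM pmon process.
asm_pmon_pid() {
    awk '$NF ~ /^asm_pmon_\+ASM/ { print $2 }'
}
```

A typical use on the CRSD PE master node would be `pid=$(ps -ef | asm_pmon_pid)` followed by `kill -9 "$pid"`, after noting the fault injection time.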
Clusterware:
Because OCR is stored in ASM, if ASM fails or is brought down on the CRSD PE master node, the CRSD PE master will exit and a new CRSD PE master will be selected.

- ASM and CRSD will be automatically restarted.

- The RDBMS instance should connect to another available ASM instance in a Flex ASM environment.

- After CRSD restarts, the resource states should not change.

- The new CRSD PE master node should be the old CRSD PE standby master. A new CRSD PE standby master should be elected from among the other nodes.

(CRSD should recover the resources’ previous state)

For 12cR1, collect the output of “crsctl stat res -t” in a 60s loop from the beginning to the end of the run.  Attach the output for auditing.



Collect Logfiles
Run each destructive test, taking note of the test start time, test stop time and fault injection time.  On the surviving node (if applicable), run “date; crsctl stat res -t; sleep 60” in a loop.
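The status loop could be wrapped in a small function that appends a timestamp and the command output to a file at a fixed interval. This is a minimal sketch: `collect_status`, its argument order and the output path in the usage example are illustrative choices, not part of the test kit.

```shell
# Usage: collect_status <interval_seconds> <iterations> <outfile> <command...>
# Appends "date" plus the command output to <outfile> on every iteration.
collect_status() {
    interval=$1
    count=$2
    outfile=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        { date; "$@"; } >> "$outfile" 2>&1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For a run expected to last about an hour, `collect_status 60 60 /tmp/crs_stat_res.log crsctl stat res -t` takes one sample per minute. Start it in the background before injecting the fault and attach the resulting file.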

At the end of the test run, collect the following logs, place them in the directory <CRSHome>/log using the name format [log_name]_[hostname], and then tar and compress them into a file named <VendorName>_<TestCode>.tar.gz (e.g. WidgetCorp_HW-STOR-07.tar.gz):
·         Under <CRSHome>/log/<hostname>, the following logs are required
o   alert[hostname].log
o   crsd/crsd.log
o   cssd/ocssd.log
o   evmd/evmd.log
o   ohasd/ohasd.log
o   gpnpd/gpnpd.log
o   diskmon/diskmon.log
o   mdnsd/mdnsd.log
o   ctssd/ctssd.log
o   agent/*
o   gipcd/gipcd.log  (11.2.0.2 new feature)
o   cvu/cvulog/*.log (11.2.0.2 new feature)
o   cvu/cvutrc/*     (11.2.0.2 new feature)
o   srvm/*
o   admin/*
o   acfs/*
o   crfmond/*
o   crflogd/*
o   racg/*
o   gnsd/* (if gns configured)
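The archive naming convention described above can be captured in a helper so that every test produces a consistently named file. The sketch below assumes the logs have already been copied into a staging directory; `archive_name` and `package_logs` are illustrative names.

```shell
# Builds the archive file name <VendorName>_<TestCode>.tar.gz.
archive_name() {
    printf '%s_%s.tar.gz\n' "$1" "$2"
}

# Usage: package_logs <staging_dir> <VendorName> <TestCode>
# Creates the compressed archive in the current directory.
package_logs() {
    staging_dir=$1
    vendor=$2
    test_code=$3
    tar -czf "$(archive_name "$vendor" "$test_code")" -C "$staging_dir" .
}
```

For example, `package_logs /tmp/hw-cw-10-logs WidgetCorp HW-CW-10` writes WidgetCorp_HW-CW-10.tar.gz to the current directory. Run it from a directory other than the staging directory so the archive does not include itself.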