Sunday, August 14, 2011

Resolving “library cache pin” Waits

The old traditional export (exp) utility is hanging indefinitely at “exporting system procedural objects and actions” stage of the export.

 
[oracle@node1 ~]$ /orabackup/exp/prtdb/exp_prtdb.sh

Export: Release 11.2.0.1.0 - Production on Sun Aug 14 21:39:03 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit                                          Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,                                          OLAP,
Data Mining and Real Application Tes
Export done in WE8ISO8859P1 character set and AL16UTF16 NCHAR character set
server uses WE8MSWIN1252 character set (possible charset conversion)

About to export the entire database ...
. exporting tablespace definitions
. exporting profiles
. exporting user definitions
. exporting roles
. exporting resource costs
. exporting rollback segment definitions
. exporting database links
. exporting sequence numbers
. exporting directory aliases
. exporting context namespaces
. exporting foreign function library names
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions
. exporting system procedural objects and actions

Following are the details from V$SESSION view of this session:

SID:				190
SERIAL#:			379
USERNAME:			EXPIMP
STATUS:				ACTIVE
PROGRAM:			exp@node1 (TNS V1-V3)
LAST_CALL_ET:			1363
BLOCKING SESSION STATUS:	NOT IN WAIT
EVENT#:				280
EVENT:				library cache pin
WAIT CLASS:			Concurrency
STATE:				WAITED SHORT TIME
BLOCKING SESSION:						
  

As you can see the session is waiting on “library cache pin” wait event. This means there is at least one other session holding a lock on some stored procedure which “exp” is trying to pin. The “BLOCKSING SESSION” column does not report which session is holding a lock (or for that matter blocking).

I used “oradebug” utility to analyze the situation. Following is an excerpt from the log file generated by “oradebug”:

    Oracle session identified by:
    {
                instance: 1 (prtdb.prtdb1)
                   os id: 24885
              process id: 49, oracle@node1
              session id: 13
        session serial #: 839
    }
    is waiting for 'Streams AQ: waiting for messages in the queue' with wait info:

Below is the information about session “13” from V$SESSION view:

SID:				13
SERIAL#:			841
USERNAME:			DBSNMP
STATUS:				ACTIVE
PROGRAM:			emagent@node1 (TNS V1-V3)
LAST_CALL_ET:			5
BLOCKING SESSION STATUS:	UNKNOWN
EVENT#:				364
EVENT:			Streams AQ: waiting for messages in the queue
WAIT CLASS:		Idle
STATE:			WAITING
BLOCKING SESSION:						

The culprit session belongs to “DBSNMP” user and it is waiting on an “Idle” wait event.

It was easy to kill this session as it did not belong to any application. After killing the session (13), export happily started working and things were all normal.

. exporting pre-schema procedural objects and actions
. exporting cluster definitions
. about to export SYSTEM's tables via Conventional Path ...
. . exporting table                    DEF$_AQCALL
:
:
:

References:

Note:215858.1 - Interpreting HANGANALYZE trace files to diagnose hanging and performance problems

Note: 175006.1 Steps to generate HANGANALYZE trace files

Note: 402983.1 Master Note: How to diagnose Database Performance

Diagnosing Library Cache Latch Contention: A Real Case Study By Arup Nanda

Friday, August 12, 2011

Confused SRVCTL in Oracle 11g R2

OS: RHEL 5.5 64-bit

DB: Oracle Database 11gR2

Grid: Oracle Clusterware 11gR2

A database “adverten” is running on a Linux (64-bit) box. This database happens to be a two-node Oracle RAC database. The instance names of this database are “adverten1” and ““adverten2” respectively on the two nodes.

I ran the RACcheck tool but unfortunately it does not list this database. However other databases running on this server are listed.

Below is the excerpt from “raccheck” tool:

List of running databases registered in OCR

1. clonedb

2. frisdb

3. All

4. None

Believe me; the instance is there and running.

 
[oracle@node1 raccheck]$ ps -ef|grep pmon
grid      8585     1  0 Aug06 ?        00:00:17 asm_pmon_+ASM1
oracle   12351  8299  0 22:16 pts/7    00:00:00 grep pmon
oracle   21959     1  0 Aug09 ?        00:00:31 ora_pmon_clonedb1
oracle   24222     1  0 Aug09 ?        00:00:19 ora_pmon_adverten1
oracle   29006     1  0 Aug07 ?        00:00:32 ora_pmon_frisdb1
[oracle@node1 raccheck]$

It’s time to debug “raccheck” tool.

“is_rdbms_installed_crs ()” procedure of the “raccheck” script is of our interest for now and below are the statements from where it is actually fetching database names.

It grabs the list of databases using the following:

       $CRS/bin/srvctl config database > $db_list_fil

And checks for database status using the following:

 
crs_db_status=$($CRS/bin/srvctl status database -d $db_list|grep -i $loc	alnode|grep -ic "is running")

Okay, running the first command manually lists my database:

 
[grid@node1 ~]$ $CRS_HOME/bin/srvctl config database
advertence
clonedb
frisdb
[grid@node1 ~]$

If you notice, the database name listed above is “advertence” (10 letter word). The database name from V$DATABASE view is “adjudica” (8 letter word), also instances use “adverten”.

 
SQL> select name from v$database;

NAME
---------
ADJUDICA

SQL>

However, the second command could not locate the database instances:

[grid@node1 ~]$ $CRS_HOME/bin/srvctl status database -d advertence
Instance adverten1 is not running on node node1
Instance adverten2 is not running on node node2
[grid@node1 ~]$

Of course “adverten” is not registered resource. So the following will fail:

 
[grid@abis19 ~]$ $CRS_HOME/bin/srvctl status database -d adverten
PRCD-1120 : The resource for database adverten could not be found.
PRCR-1001 : Resource ora. adverten.db does not exist
[grid@abis19 ~]$

This is the reason “raccheck” is not able to list this database.

What might have happened is the question that needs to be answered.

At the time of database creation, “advertence” was keyed in as the database name. But Oracle created a database by trimming it to 8 characters and reserving 1 character for instance number. But for some reason “advertence” database resource was registered.

Apparently, the “ora. advertence.db” resource registered with the cluster is in “OFFLINE” state.

 
[grid@node1 ~]$ crs_stat  ora. advertence.db
NAME=ora. advertence.db
TYPE=ora.database.type
TARGET=OFFLINE
STATE=OFFLINE

[grid@node1 ~]$

It’s all messy here, but things are working.

I will have to create a Service Request with Oracle Support to resolve this issue for this client of mine.