Monday, May 26, 2008

Database is very slow!!!

Dear Readers,

I received a call from one of our developers and following was the conversation that took place between us:

He: “Database is very slow”

Me: What is very slow? Can you tell me what actually you are doing?

He: I am running a simple report which is supposed to return less than 100 records. But it’s been more than 4 hours and the query is still running.

Me: How long it use to take before?

He: This is the first time we are running this query.

Me: Ok, let me log in to the database.



I logged into the database and ran my set of commands to trace the culprit SQL. I was able to identify the query and here it is:

SELECT
      TO_CHAR(TRN_DATE, 'YYYY-MM-DD HH24:MI:SS') TRN_DATE,
      TRAN_TYPE,
      TRAN_REF,
      KEY_NO,
      IN_AMT A1,
      0 A2
 FROM TRANSACTION_DAILY
WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
      AND SRC_FLAG ='L'
      AND (SUBSTR(OUT_KEY_NO,1,10) = 'DXIY0-19XV' or SUBSTR(IN_KEY_NO,1,10) = 'DXIY0-19XV')
      AND TRAN_TYPE  NVL(KEY_NO,'A')  NVL(IN_AMT,0) NOT IN (
                    SELECT  TD_TRAN_TYPE  TD_KEY_NO_REF  NVL(OUT_AMT,0)
                      FROM TRANSACTION_DETAILS
                     WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) 
                       AND RM_CODE = 'XXX123'  )
UNION
SELECT
      TO_CHAR(TRN_DATE, 'YYYY-MM-DD HH24:MI:SS') TRN_DATE,
      TD_TRAN_TYPE,
      TRAN_REF, 
      TD_KEY_NO_REF, 
      0 A1,
      OUT_AMT A2
 FROM TRANSACTION_DETAILS
WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
  AND RM_CODE = 'XXX123' 
  AND TD_TRAN_TYPE  TD_KEY_NO_REF  NVL(OUT_AMT,0)   NOT IN (
                    SELECT TRAN_TYPE  NVL(KEY_NO,'A')  NVL(IN_AMT,0)
                      FROM TRANSACTION_DAILY
                     WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
                       AND SRC_FLAG ='L'
                       AND (SUBSTR(OUT_KEY_NO,1,10) = 'DXIY0-19XV' 
                            or SUBSTR(IN_KEY_NO,1,10) = 'DXIY0-19XV') )

and this is the execution plan:

-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                       |       |       |    51M(100)|          |       |       |
|   1 |  SORT UNIQUE                          |                       |  2842 |   159K|    51M (70)|198:47:09 |       |       |
|   2 |   UNION-ALL                           |                       |       |       |            |          |       |       |
|*  3 |    FILTER                             |                       |       |       |            |          |       |       |
|   4 |     PARTITION RANGE SINGLE            |                       |  2042 |   135K|  2854   (2)| 00:00:40 |   KEY |   KEY |
|*  5 |      TABLE ACCESS FULL                | TRANSACTION_DAILY     |  2042 |   135K|  2854   (2)| 00:00:40 |   KEY |   KEY |
|   6 |     PARTITION RANGE SINGLE            |                       |   760 | 19000 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|*  7 |      TABLE ACCESS BY LOCAL INDEX ROWID| TRANSACTION_DETAILS   |   760 | 19000 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|*  8 |       INDEX RANGE SCAN                | TRAN_DET_IDX          |   434K|       |   980   (1)| 00:00:14 |   KEY |   KEY |
|*  9 |    FILTER                             |                       |       |       |            |          |       |       |
|  10 |     PARTITION RANGE SINGLE            |                       |   800 | 24800 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|* 11 |      TABLE ACCESS BY LOCAL INDEX ROWID| TRANSACTION_DETAILS   |   800 | 24800 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|* 12 |       INDEX RANGE SCAN                | TRAN_DET_IDX          |   434K|       |   980   (1)| 00:00:14 |   KEY |   KEY |
|* 13 |     TABLE ACCESS BY GLOBAL INDEX ROWID| TRANSACTION_DAILY     |  1940 |   111K| 88735   (1)| 00:20:43 | ROW L | ROW L |
|* 14 |      INDEX RANGE SCAN                 | PK_TRANSACTION_DAILY  |   102K|       |   413   (1)| 00:00:06 |       |       |
-------------------------------------------------------------------------------------------------------------------------------


From the above execution plan, Optimizer thinks it needs “198:47:09” hours to “Sort Unique” the result set. But the actual time the query took to complete was 6 hours and 49 minute (24540 seconds).

I then simplified the query in order to understand it clearly by removing some of the predicates. The simplified query is:

 
SELECT *
 FROM TRANSACTION_DAILY
WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) 
      AND TRAN_TYPE || NVL(KEY_NO,'A') || NVL(IN_AMT,0) NOT IN (
                    SELECT  TD_TRAN_TYPE || 
                            TD_KEY_NO_REF || 
                            NVL(OUT_AMT,0) 
                      FROM TRANSACTION_DETAILS
                     WHERE TRN_DATE = 
                           TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )  )
UNION
SELECT *
 FROM TRANSACTION_DETAILS
WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) 
  AND TD_TRAN_TYPE || TD_KEY_NO_REF || NVL(OUT_AMT,0)   NOT IN (
                    SELECT TRAN_TYPE || 
                           NVL(KEY_NO,'A') || 
                           NVL(IN_AMT,0)
                      FROM TRANSACTION_DAILY
                     WHERE TRN_DATE = 
                            TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) )

Hmmmm, this query seems to be doing a full outer join on “TRANSACTION_DAILY” and “TRANSACTION_DETAILS” tables. I called him back:

Me: What is the objective of this report?

He: We need to find all the information from “TRANSACTION_DAILY” table which does not exist in “TRANSACTION_DETAILS” table and vice versa.

Yes, I was right. It’s doing a “Full Outer Join”. I modified his original query and proposed him a new query which is executing at lighting speed and more importantly yielding the same result:

 
SELECT  nvl(a.TRN_DATE, b.TRN_DATE) TRN_DATE,
        a.TRN_TYPE, b.OTH_TRN_TYPE,
        a.KEY_NO, b.KEY_NO_REF,
        nvl(a.IN_AMT, 0) A1, 
        nvl(b.OUT_AMT, 0) A2
FROM  (select TRN_DATE, TRN_TYPE, KEY_NO, IN_AMT, 
         from TRANSACTION_DAILY 
        where TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')
          and (substr(IN_KEY_NO, 1, 10) = 'DXIY0-19XV' OR
               substr(OUT_KEY_NO, 1, 10) = 'DXIY0-19XV')) a 
  FULL OUTER JOIN 
      (select TRN_DATE, OTH_TRN_TYPE, KEY_NO_REF, OUT_AMT, 
         from TRANSACTION_DETAILS 
        where RM_CODE = 'XXX123'
          and TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')) b
  ON (a.TRN_DATE = b.TRN_DATE and 
      a.TRN_TYPE = b.OTH_TRN_TYPE and 
      a.KEY_NO = b.KEY_NO_REF and 
      a.IN_AMT = b.OUT_AMT and 
      a.rno = b.rno)
WHERE (a.TRN_TYPE is null or 
       b.OTH_TRN_TYPE is null or 
       a.KEY_NO is null or 
       b.KEY_NO_REF is null or 
       a.IN_AMT is null or 
       b.OUT_AMT is null);

I called him back with over-excitement announcing him that the query is ready and let us test it:

Me: I have re-written the query, kindly come over to my office so that we can test it together.

He: How long does it take to execute?

Me: It completes in less than 4 seconds.

He: What...? … less than 4 seconds … ? … Are you sure, have you included all the condition that I have mentioned?

Me: Yes, let us test it.

He made couple of tests and the query was perfect to all the test cases but except one. Requirements started getting little messy so, I decided to create two tables and test the query against them. I created two tables, A and B, of same table structure with sample data:

 
create table a(id number, amt number);
insert into a values (1, 10);
insert into a values (2, 20);
insert into a values (3, 30);
insert into a values (4, 40);

create table b(id number, amt number);
insert into b values (2, 20);
insert into b values (3, 30);
insert into b values (5, 50);
insert into b values (6, 60);

commit;

Basically, we need to find out data that exists in A and is not in B and also records that exist in B which are not A. Here’s the query:

 
SQL> select *
  2    from a FULL OUTER JOIN b
  3      on a.id = b.id
  4   where a.id is null or b.id is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         4         40
                               6         60
                               5         50

SQL>

He said: “Yes, the query is fetching right data, but what happens when you have two records with same id and amt in table A and only a single record in B? In this case, the query should display one record from table A.”

Adding more to the complexity he said: “There is no third column in either table to distinguish this occurrence.”

So, we inserted a record in each table and ran the same query:

 
insert into a values (1, 10);
insert into b values (1, 10);


SQL> select *
  2    from a FULL OUTER JOIN b
  3      on a.id = b.id
  4   where a.id is null or b.id is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         4         40
                               6         60
                               5         50

SQL>

Oops!! The query fails at this point. Ok, I then decided to re-write this query and this time use analytical functions to rescue me from the current problematic situation.

 
SQL> select x.id, x.amt, y.id, y.amt
  2    from (select id, amt, 
  3                 row_number() over (partition by id, amt 
  4                                    order by id, amt) rno 
  5            from a) x
  6         FULL OUTER JOIN
  7         (select id, amt, 
  8                 row_number() over (partition by id, amt 
  9                                     order by id, amt) rno 
 10                 from b) y
 11      on x.id = y.id and x.rno = y.rno
 12   where x.id is null 
 13      or y.id is null 
 14      or x.rno is null 
 15      or y.rno is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         1         10
         4         40
                               5         50
                               6         60

SQL>

Yippy!!! This query rocks. I then quickly transformed the original query into this form and ran it again:

 
SELECT  nvl(a.TRN_DATE, b.TRN_DATE) TRN_DATE,
        a.TRN_TYPE, b.OTH_TRN_TYPE,
        a.KEY_NO, b.KEY_NO_REF,
        nvl(a.IN_AMT, 0) A1, 
        nvl(b.OUT_AMT, 0) A2
FROM  (select TRN_DATE, TRN_TYPE, KEY_NO, IN_AMT, 
              row_number() over (partition by KEY_NO, 
                                              TRN_TYPE, 
                                              IN_AMT 
                                     order by KEY_NO, 
                                              TRN_TYPE, 
                                              IN_AMT) rno 
         from TRANSACTION_DAILY 
        where TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')
          and (substr(IN_KEY_NO, 1, 10) = 'DXIY0-19XV' OR
               substr(OUT_KEY_NO, 1, 10) = 'DXIY0-19XV')) a 
  FULL OUTER JOIN 
      (select TRN_DATE, OTH_TRN_TYPE, KEY_NO_REF, OUT_AMT, 
              row_number() over (partition by KEY_NO_REF, 
                                              OTH_TRN_TYPE, 
                                              OUT_AMT 
                                     order by KEY_NO_REF, 
                                              OTH_TRN_TYPE, 
                                              OUT_AMT) rno    
         from TRANSACTION_DETAILS 
        where RM_CODE = 'XXX123'
          and TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')) b
  ON (a.TRN_DATE = b.TRN_DATE and 
      a.TRN_TYPE = b.OTH_TRN_TYPE and 
      a.KEY_NO = b.KEY_NO_REF and
      a.IN_AMT = b.OUT_AMT and 
      a.rno = b.rno)
WHERE (a.TRN_TYPE is null or 
       b.OTH_TRN_TYPE is null or 
       a.KEY_NO is null or 
       b.KEY_NO_REF is null or 
       a.IN_AMT is null or 
       b.OUT_AMT is null or
       a.rno is null or
       b.rno is null);

Execution plan of this query is:

 
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                       |  2230 |   409K| 48722   (1)| 00:11:23 |       |       |
|   1 |  VIEW                                    |                       |  2230 |   409K| 48722   (1)| 00:11:23 |       |       |
|   2 |   UNION-ALL                              |                       |       |       |            |          |       |       |
|*  3 |    FILTER                                |                       |       |       |            |          |       |       |
|*  4 |     HASH JOIN RIGHT OUTER                |                       |  1552 |   175K| 24361   (1)| 00:05:42 |       |       |
|   5 |      VIEW                                |                       |   678 | 40002 | 14852   (1)| 00:03:28 |       |       |
|   6 |       WINDOW SORT                        |                       |   678 | 12882 | 14852   (1)| 00:03:28 |       |       |
|   7 |        PARTITION RANGE SINGLE            |                       |   678 | 12882 | 14851   (1)| 00:03:28 |    14 |    14 |
|*  8 |         TABLE ACCESS BY LOCAL INDEX ROWID| TRANSACTION_DETAILS   |   678 | 12882 | 14851   (1)| 00:03:28 |    14 |    14 |
|*  9 |          INDEX RANGE SCAN                | TRAN_DET_IDX          |   422K|       |   897   (1)| 00:00:13 |    14 |    14 |
|  10 |      VIEW                                |                       |  1552 | 88464 |  9508   (1)| 00:02:14 |       |       |
|  11 |       WINDOW SORT                        |                       |  1552 | 69840 |  9508   (1)| 00:02:14 |       |       |
|  12 |        PARTITION RANGE SINGLE            |                       |  1552 | 69840 |  9507   (1)| 00:02:14 |     7 |     7 |
|* 13 |         TABLE ACCESS FULL                | TRANSACTION_DAILY     |  1552 | 69840 |  9507   (1)| 00:02:14 |     7 |     7 |
|* 14 |    HASH JOIN ANTI                        |                       |   678 | 78648 | 24361   (1)| 00:05:42 |       |       |
|  15 |     VIEW                                 |                       |   678 | 40002 | 14852   (1)| 00:03:28 |       |       |
|  16 |      WINDOW SORT                         |                       |   678 | 12882 | 14852   (1)| 00:03:28 |       |       |
|  17 |       PARTITION RANGE SINGLE             |                       |   678 | 12882 | 14851   (1)| 00:03:28 |    14 |    14 |
|* 18 |        TABLE ACCESS BY LOCAL INDEX ROWID | TRANSACTION_DETAILS   |   678 | 12882 | 14851   (1)| 00:03:28 |    14 |    14 |
|* 19 |         INDEX RANGE SCAN                 | TRAN_DET_IDX          |   422K|       |   897   (1)| 00:00:13 |    14 |    14 |
|  20 |     VIEW                                 |                       |  1552 | 88464 |  9508   (1)| 00:02:14 |       |       |
|  21 |      WINDOW SORT                         |                       |  1552 | 69840 |  9508   (1)| 00:02:14 |       |       |
|  22 |       PARTITION RANGE SINGLE             |                       |  1552 | 69840 |  9507   (1)| 00:02:14 |     7 |     7 |
|* 23 |        TABLE ACCESS FULL                 | TRANSACTION_DAILY     |  1552 | 69840 |  9507   (1)| 00:02:14 |     7 |     7 |
----------------------------------------------------------------------------------------------------------------------------------

The results were convincing and it still takes 4 seconds to fetch the data compared to 6 hours and 49 minutes. It is 6135 times faster than the original one.

This query left End-Users rejoicing and he left my desk smiling but without answering my question, “Is the database very slow?”

P.S.: I have renamed all the table names and columns names so as not to reveal official and sensitive information on my personal blog, yet preserving the reality.

Regards

Friday, May 23, 2008

Automatic statistics gathering during Index Creation and Rebuilds

Dear all,

While testing dynamic sampling in Oracle 10g, I came to learn a 10g new feature (very late though).

My test case for dynamic sampling goes like this:

(a) Create table with data (b) Create an index, and (c) Execute the query and note the execution plan
SQL> Create table t as select * from scott.emp;

Table created.

SQL> Create index t_idx on t(empno);

Index created.

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 1020776977

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    87 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    87 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Yes, the optimizer did dynamic sampling and picked up the index plan as the optimal execution plan. Now, let me delete the statistics and re-run the same SQL.

SQL> Exec dbms_stats.delete_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    87 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    87 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Oops, optimizer did consider dynamic sampling but this time chose a Full Table Scan. How’ come the same optimizer picked index access plan on the first run?

Ok, let me do it again and this time being very careful.

(a) Create table with data
SQL> Drop table t purge;

Table dropped.

SQL> Create table t as select * from scott.emp;

Table created.

SQL> Select num_rows, last_analyzed from user_tables where table_name = 'T';

  NUM_ROWS LAST_ANALYZED
---------- --------------


SQL>

(b) Create an index, and

SQL> Create index t_idx on t(empno);

Index created.

SQL> column index_name format a10
SQL> set line 10000

SQL> select blevel, leaf_blocks, distinct_keys, clustering_factor, 
  2  num_rows, last_analyzed, user_stats, global_stats
  3  from user_indexes
  4  where index_name = 'T_IDX';

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED        USE GLO
---------- ----------- ------------- ----------------- ---------- -------------------- --- ---
         0           1            14                 1         14 05-May-2008 10:14:48 NO  NO

SQL> 

Oh! I did not gather statistics.

(c) Execute the query and note the execution plan

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 1020776977

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    87 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    87 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Now it’s clear why optimizer finds Index Access Path to be best plan. Let’s continue and delete the statistics and then execute the query again:

SQL> Exec dbms_stats.delete_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> select blevel, leaf_blocks, distinct_keys, clustering_factor, 
  2  num_rows, last_analyzed, user_stats, global_stats
  3  from user_indexes
  4  where index_name = 'T_IDX';

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED        USE GLO
---------- ----------- ------------- ----------------- ---------- -------------------- --- ---
                                                                                       NO  NO

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    87 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    87 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL>  Set autotrace off

This time the optimizer did dynamic sampling and FTS seems to be less expensive.

In Oracle 10g, when an index is created, Oracle automatically gathers statistics on the newly created index. This is also true in case of index rebuild. Because of this reason, optimizer picked Index Path to be the best plan.

References:

Oracle® Database SQL Reference 10g Release 2 (10.2)

Regards

Monday, May 12, 2008

ORA-27061: waiting for async I/Os failed

Hi

One of our Physical Standby database aborted with "ORA-27061: waiting for async I/Os failed" error. The instance was killed by the DBWR process. Apparently, it was found to be an OS bug (IBM - AIX5L Based Systems) and an OS patch seems to resolve this issue.

Our current OS Patch level is AIX5.3 TL5-CSP and we need to upgrade it to either AIX5.3 TL6 SP4 or AIX5.3 TL7 to avoid this problem occurring in the future.

References:

Metalink Note Id: 467613.1 addresses this issue.

Happy reading :-)

Monday, May 05, 2008

Becareful when using DBMS_UTILITY to analyze

Hello,

If you are still using DBMS_UTILITY.ANALYZE_DATABASE or DBMS_UTILITY.ANALYZE_SCHEMA to analyze your database/schema's then you need to be very cautious.

I have observed a very strange behavior of this procedure against partitioned tables in one of our databases. Statistics are not being updated at table level. Although, partitions statistics are up to date.

The reason for incorrect statistics is:

If you use DBMS_STATS package to gather table statistics on a partitioned table and then later you use DBMS_UTILITY.ANALYZE_SCHEMA, table-level statistics are NOT updated, rather, statistics on partitions and indexes are modified.

This peculiar behavior is observed only with partitioned tables.

ANALYZE_SCHEMA procedure is obsolete and any one of us using this to gather statistics should seriously think of moving to DBMS_STATS package.

Following is a simple demo:

SQL> CREATE TABLE part_tab
  2    (id  NUMBER(5),
  3     dt    DATE)
  4     PARTITION BY RANGE(dt)
  5     (
  6     PARTITION part1_jan2008 VALUES LESS THAN(TO_DATE('01/02/2008','DD/MM/YYYY')),
  7     PARTITION part2_feb2008 VALUES LESS THAN(TO_DATE('01/03/2008','DD/MM/YYYY')),
  8     PARTITION part3_mar2008 VALUES LESS THAN(TO_DATE('01/04/2008','DD/MM/YYYY')),
  9     PARTITION part4_apr2008 VALUES LESS THAN(TO_DATE('01/05/2008','DD/MM/YYYY'))
 10    );

Table created.

SQL>
SQL>
SQL> create table non_part_tab (id number, dt date);

Table created.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB
PART_TAB

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008
PART1_JAN2008
PART3_MAR2008
PART4_APR2008

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:05:43
PART_TAB                                0 05-05-2008 00:05:43

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:05:43
PART1_JAN2008                           0 05-05-2008 00:05:43
PART3_MAR2008                           0 05-05-2008 00:05:43
PART4_APR2008                           0 05-05-2008 00:05:43

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:05:59
PART_TAB                                0 05-05-2008 00:05:59

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:05:59
PART1_JAN2008                           0 05-05-2008 00:05:59
PART3_MAR2008                           0 05-05-2008 00:05:59
PART4_APR2008                           0 05-05-2008 00:05:59

SQL> exec dbms_stats.gather_schema_stats('TEST');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:06:11
PART_TAB                                0 05-05-2008 00:06:11

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:06:11
PART1_JAN2008                           0 05-05-2008 00:06:11
PART3_MAR2008                           0 05-05-2008 00:06:11
PART4_APR2008                           0 05-05-2008 00:06:11

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:06:23
PART_TAB                                0 05-05-2008 00:06:11 ---> Statistics are NOT updated.

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:06:23
PART1_JAN2008                           0 05-05-2008 00:06:23
PART3_MAR2008                           0 05-05-2008 00:06:23
PART4_APR2008                           0 05-05-2008 00:06:23

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:08:53
PART_TAB                                0 05-05-2008 00:06:11 ---> Statistics are NOT updated.

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:08:53
PART1_JAN2008                           0 05-05-2008 00:08:53
PART3_MAR2008                           0 05-05-2008 00:08:53
PART4_APR2008                           0 05-05-2008 00:08:53

SQL> exec dbms_stats.gather_schema_stats('TEST');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables ;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:09:05
PART_TAB                                0 05-05-2008 00:09:05

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:09:05
PART1_JAN2008                           0 05-05-2008 00:09:05
PART3_MAR2008                           0 05-05-2008 00:09:05
PART4_APR2008                           0 05-05-2008 00:09:05

SQL>

Useful references:

1)How to Move from ANALYZE (using DBMS_UTILITY) to DBMS_STATS (Note: 237397.1)

2) Oracle® Database PL/SQL Packages and Types Reference 10g Release 2 (10.2)

Happy reading !!!

Thursday, May 01, 2008

Small Change and A Huge Gain

Hi,

Elapsed time of one of our data warehouse procedure was 17 minutes on an average. Following is the skeleton procedure:

Create or replace procedure Update_Customers Is
  Cursor Cust_Cur Is
         Select * From Stg_Customers;

  l_Cust_Value Customers.Cust_Value;
Begin
  For Cust_Rec In Cust_Cur Loop
    :
    :
    Select Value Into l_Cust_Value
      From Customers
     Where Cust_Code = Cust_Rec.Cust_Code
       And Cust_Key = Cust_Rec.Cust_Key;
    :
    :
  End Loop;
End Update_Customers;

Upon taking snaps before and after executing the procedure it was evident that most of the time was consumed by the "SELECT...FROM Customers...." statement.

Elapsed Time from AWR report:

Procedure: 1021 Seconds

SQL Statement: 925 Seconds

There is a Composite Index on "Cust_Code" and "Cust_Key" columns. When I ran the same statement in SQL*Plus, it was fetching results very fast using the appropriate index.

STG_CUSTOMERS is a staging table which consists of nearly 250,000 records. Data is daily purged and populated in this table. CUSTOMERS table was probed for 250,000 times in the loop, individual query execution was very fast but repeated executions within the loop were causing the query to consume more time.

I replaced the original cursor by joining CUSTOMERS and STG_CUSTOMERS tables as shown below:

Create or replace procedure Update_Customers Is
  Cursor Cust_Cur Is
         Select * From Stg_Customers A, Customers b
           Where a.cust_code = b.cust_code
              And a.cust_key =  b.cust_key;

  l_Cust_Value Customers.Cust_Value;
Begin
  For Cust_Rec In Cust_Cur Loop
    :
    :
    :
  End Loop;
End Update_Customers;

When this modified procedure was executed, the performance was remarkably improved and the elapsed time dropped to only less than 75 seconds.

Below is the elapsed time of the same procedure before and after modification:

TYPE            DATE           Elapsed(Min)        CPU(Min)
--------------- ----------- --------------- ---------------
Procedure       19-Apr-2008           17.70            1.87
SQL Statement   19-Apr-2008           15.98           13.15
Procedure       20-Apr-2008           17.67            1.80
SQL Statement   20-Apr-2008           16.05           12.95
Procedure       21-Apr-2008           16.93            1.82
SQL Statement   21-Apr-2008           15.35           12.85
Procedure       22-Apr-2008           16.68            1.78
SQL Statement   22-Apr-2008           15.08           12.42
Procedure       23-Apr-2008           16.38            1.78
SQL Statement   23-Apr-2008           14.77           12.43
Procedure       24-Apr-2008            1.15             .92
Procedure       25-Apr-2008            1.13             .92
Procedure       26-Apr-2008            1.20             .93
Procedure       27-Apr-2008            1.23             .93

Regards