Tuesday, July 14, 2009

Who says Full Table Scans are Evil?

Recently, I had a developer complaining that a query which normally runs very fast is now taking huge amount of time to complete.

Following is the culprit query and its execution plan:

SELECT  DISTINCT SUBSTR(A.MF_NO,4) MF_NO,
 B.SRT_SEQ,
 SUBSTR(B.MAIN_CODE,1,6) MAIN_CODE
  FROM  DETAIL_TBL A,
        MASTER_TBL B    
  WHERE A.MRF_ID   = B.MRF_ID    
    AND A.MRF_CODE = B.MRF_CODE    
    AND TRIM(SUBSTR(A.MF_NO,4)) IN (SELECT LPAD(MF_NO,4,0) 
                                     FROM REMOTE_TBL@RMLINK 
                                    WHERE USER_LEVEL = 'RO'  
                                      AND STATUS_CODE <> 'D');

Execution Plan
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                  | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                       |       |       |  6646 (100)|          |        |      |
|   1 |  HASH UNIQUE                 |                       |     1 |    62 |  6646   (2)| 00:01:34 |        |      |
|   2 |   TABLE ACCESS BY INDEX ROWID| MASTER_TBL            |     1 |    30 |     5   (0)| 00:00:01 |        |      |
|   3 |    NESTED LOOPS              |                       |     1 |    62 |  6645   (2)| 00:01:34 |        |      |
|*  4 |     HASH JOIN                |                       |     1 |    32 |  6640   (2)| 00:01:33 |        |      |
|   5 |      REMOTE                  | REMOTE_TBL            |     1 |    12 |    24   (0)| 00:00:01 |  REMDB | R->S |
|   6 |      TABLE ACCESS FULL       | DETAIL_TBL            |  4159K|    79M|  6589   (2)| 00:01:33 |        |      |
|*  7 |     INDEX RANGE SCAN         | SYS_C0011692          |     3 |       |     2   (0)| 00:00:01 |        |      |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(TRIM(SUBSTR("A"."MF_NO",:SYS_B_3))=LPAD("MF_NO",:SYS_B_4,TO_CHAR(:SYS_B_5)))
   7 - access("A"."MRF_ID"="B"."MRF_ID" AND "A"."MRF_CODE"="B"."MRF_CODE")

Using AWR SQL report (awrsqrpt.sql) for the time when the query was performing better shows that a Full Table Scan was performed on the MASTER_TBL.

Execution Plan
----------------------------------------------------------
Plan hash value: 1931109479

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                       |  2919K|   172M|       | 42564   (2)| 00:09:56 |        |      |
|   1 |  HASH UNIQUE         |                       |  2919K|   172M|   424M| 42564   (2)| 00:09:56 |        |      |
|*  2 |   HASH JOIN          |                       |  2919K|   172M|    11M| 15147   (2)| 00:03:33 |        |      |
|   3 |    TABLE ACCESS FULL | MASTER_TBL            |   289K|  8494K|       |  5590   (2)| 00:01:19 |        |      |
|*  4 |    HASH JOIN         |                       |  1225K|    37M|       |  6833   (2)| 00:01:36 |        |      |
|   5 |     REMOTE           | REMOTE_TBL            |    29 |   348 |       |    24   (0)| 00:00:01 |  REMDB | R->S |
|   6 |     TABLE ACCESS FULL| DETAIL_TBL            |  4191K|    79M|       |  6781   (2)| 00:01:35 |        |      |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."MRF_ID"="B"."MRF_ID" AND "A"."MRF_CODE"="B"."MRF_CODE")
   4 - access(TRIM(SUBSTR("A"."MF_NO",4))=LPAD("MF_NO",4,'0'))

When we add a Full Table Scan (FTS) hint on the MASTER_TBL, the same query ran in less than 20 seconds. We can modify the query to include FTS hint as it is a home grown application.

SELECT  /*+ FULL(A) */ DISTINCT SUBSTR(A.MF_NO,4) MF_NO,
 B.SRT_SEQ,
 SUBSTR(B.MAIN_CODE,1,6) MAIN_CODE
  FROM  DETAIL_TBL A,
        MASTER_TBL B    
  WHERE A.MRF_ID   = B.MRF_ID    
    AND A.MRF_CODE = B.MRF_CODE    
    AND TRIM(SUBSTR(A.MF_NO,4)) IN (SELECT LPAD(MF_NO,4,0) 
                                     FROM REMOTE_TBL@RMLINK 
                                    WHERE USER_LEVEL = 'RO'  
                                      AND STATUS_CODE <> 'D');


Execution Plan
----------------------------------------------------------
Plan hash value: 1931109479

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                       |  2919K|   172M|       | 42564   (2)| 00:09:56 |        |      |
|   1 |  HASH UNIQUE         |                       |  2919K|   172M|   424M| 42564   (2)| 00:09:56 |        |      |
|*  2 |   HASH JOIN          |                       |  2919K|   172M|    11M| 15147   (2)| 00:03:33 |        |      |
|   3 |    TABLE ACCESS FULL | MASTER_TBL            |   289K|  8494K|       |  5590   (2)| 00:01:19 |        |      |
|*  4 |    HASH JOIN         |                       |  1225K|    37M|       |  6833   (2)| 00:01:36 |        |      |
|   5 |     REMOTE           | REMOTE_TBL            |    29 |   348 |       |    24   (0)| 00:00:01 |  REMDB | R->S |
|   6 |     TABLE ACCESS FULL| DETAIL_TBL            |  4191K|    79M|       |  6781   (2)| 00:01:35 |        |      |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."MRF_ID"="B"."MRF_ID" AND "A"."MRF_CODE"="B"."MRF_CODE")
   4 - access(TRIM(SUBSTR("A"."MF_NO",4))=LPAD("MF_NO",4,'0'))

This worked as an emergency painkiller before the routine lab tests are performed and an appropriate action is decided.

In the bad execution plan, to access the master table (MASTER_TBL), optimizer is doing an index range scan (“SYS_C0011692”) and the cardinality reported is “3”. This is where the real problem is hiding. Drilling into the tables, columns and indexes for both the tables revealed this information:

Table Name: MASTER_TBL
Primary Key: MRF_ID, MRF_CODE, ORDNO
Num Rows: 287,456

Table Name: CHILD_TBL
Primary Key: MRF_ID, MRF_CODE, ORDNO
Num Rows: 4,159,964

Oops… there are nearly 300,000 records in the MASTER_TBL, but the execution plan indicates a cardinality of merely 3. So, there is something’s serious wrong somewhere. First suspect is table and index statistics. As suspected, statistics on MASTER_TBL were botched up. These were the steps performed on the MASTER_TBL:

1) Data was cleaned 2) Initial data load was run 3) Statistics were collected 4) Remaining major chunk of the data loaded

After collecting fresh statistics on the MASTER_TBL things got back on track. Also, data belonging to columns “MRF_ID” & “MRF_CODE” is evenly spread across all the used blocks in MASTER_TBL & CHILD_TBL tables. Moreover, there is no limiting condition on these columns which could encourage optimizer to go with an index access path. So, it would make no sense for the optimizer to go with an index access path.