The Momen Blog

We have two 24x7 production databases with different charactersets. Yes, both charactersets are related to Arabic language. Database A’s characterset is AR8MSWIN1256 while database B’s characterset is AR8ISO8859P6.

We had a requirement wherein database A will read Arabic data from database B. If you create a database link and try to query data from database B it will appear to be garbage, as shown below:

whereas when you run the same query on database B, this is how the output looks:

Even you are unaware of Arabic language, you may visually compare and see the output is a garbage when queried from database A.

DUMP function comes handy under these situations. It returns internal representation of an expression. When used to against the “name” failed, I get this output.

SQL> select dump(name) from test_iso@testdb;

DUMP(NAME)
--------------------------------------------------------------------
Typ=1 Len=13: 227,205,227,207,32,194,213,221,32,227,196,227,228

where, “Typ” refers to datatype, “Len” means the length of the string, and everything after “:” is comma separated ASCII values of all the characters in the “name” field.

I wrote a small procedure where convert all the comma separated ASCII values to character string.

create or replace function read_name (p_id in number) return varchar2 is
  t_str varchar2(2000);
  l_name varchar2(2000);
  l_output varchar2(2000);
begin
  select 'select '||'chr('||replace(substr(dump(name, 1010), instr(dump(name, 1010), ':')+2), ',', ')|| chr(')||') from dual' 
    into t_str from test_iso@testdb
   where id = p_id;
 
--  dbms_output.put_line(t_str);
  execute immediate t_str into l_output;  
  return l_output;
end;
/

Now upon using this function, I am able to query remote table in a readable format.

Here’s the output:

Dear Readers,

Analytic Functions were first introduced in Oracle 8i, way back in 1999. Tom Kyte’s book “Expert One-on-One Oracle” has a dedicated chapter on this topic. Not only Tom’s book, any book on SQL (I have seen so far) has a separate chapter written on Analytic Functions. Yet, the developer community seems to be paying least attention in trying to understand and use them. I think before educating students, tutors should be trained to emphasize the importance of Analytic Functions in day to day life of a developer.

Anyways, here’s a similar case on one of our very busy OLTP database.

This SQL statement topped in the AWR report consuming nearly 84% of database time and was executed more than 1200 times during a 45-minutes AWR report.

A query is required to fetch TECH_PAC_ID for a customer along with the number of records for that customer.

Here’s the original query, its Explain Plan and Statistics:

SQL> set autotrace on
SQL> SELECT Tech_PAC_ID, Users_USER_CD, TOT_PAC 
  2   FROM (SELECT Tech_PAC_ID, Users_USER_CD 
  3          FROM Tech, Users 
  4         WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
  5           AND Users_USER_CD = Tech_ETI_USER_CD) DET, 
  6        (SELECT COUNT(*) TOT_PAC 
  7           FROM Tech, Users 
  8          WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' ) 
  9            AND Users_USER_CD = Tech_ETI_USER_CD) TOT;

TECH_PAC_N USERS_USER_CD        TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                     5

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 3349818188

------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |              |     1 |    54 |  1280   (1)| 00:00:16 |
|   1 |  MERGE JOIN CARTESIAN           |              |     1 |    54 |  1280   (1)| 00:00:16 |
|*  2 |   TABLE ACCESS BY INDEX ROWID   | TECH         |     1 |    18 |     4   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                 |              |     1 |    41 |   640   (1)| 00:00:08 |
|*  4 |     TABLE ACCESS FULL           | USERS        |     1 |    23 |   636   (1)| 00:00:08 |
|*  5 |     INDEX RANGE SCAN            | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
|   6 |   BUFFER SORT                   |              |     1 |    13 |  1276   (1)| 00:00:16 |
|   7 |    VIEW                         |              |     1 |    13 |   640   (1)| 00:00:08 |
|   8 |     SORT AGGREGATE              |              |     1 |   104 |            |          |
|*  9 |      TABLE ACCESS BY INDEX ROWID| TECH         |     1 |    12 |     4   (0)| 00:00:01 |
|  10 |       NESTED LOOPS              |              |     1 |   104 |   640   (1)| 00:00:08 |
|* 11 |        TABLE ACCESS FULL        | USERS        |     1 |    92 |   636   (1)| 00:00:08 |
|* 12 |        INDEX RANGE SCAN         | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("TECH_STS"='Y')
   4 - filter("USERS_USER_CD"='ABX65842')
   5 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)
   9 - filter("TECH_STS"='Y')
  11 - filter("USERS_USER_CD"='ABX65842')
  12 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4671  consistent gets
          0  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Instead of writing two queries to fetch TECH_PAC_ID and COUNT(USERS_USER_CD) we can achieve the same result by using Analytic Functions. The new query not only returns the required result but is also less resource intensive and more database-friendly.

Here goes the enhanced query using Analytic Function:

SQL> SELECT Tech_PAC_ID, Users_USER_CD, 
  2         count(1) over (partition by Users_USER_CD) TOT_PAC
  3          FROM Tech, Users 
  4         WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
  5           AND Users_USER_CD = Tech_ETI_USER_CD
  6  GROUP BY Tech_PAC_ID, Users_USER_CD;

TECH_PAC_N USERS_USER_CD        TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                     5

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 1328229640

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     1 |    41 |   641   (1)| 00:00:08 |
|   1 |  WINDOW BUFFER                |              |     1 |    41 |   641   (1)| 00:00:08 |
|   2 |   SORT GROUP BY               |              |     1 |    41 |   641   (1)| 00:00:08 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| TECH         |     1 |    18 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS              |              |     1 |    41 |   640   (1)| 00:00:08 |
|*  5 |      TABLE ACCESS FULL        | USERS        |     1 |    23 |   636   (1)| 00:00:08 |
|*  6 |      INDEX RANGE SCAN         | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("TECH_STS"='Y')
   5 - filter("USERS_USER_CD"='ABX65842')
   6 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2335  consistent gets
          0  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL>

The new query seems to be doing a good job. The “consistent gets” have dropped from 4671 to 2335. That’s nearly half of the original, the reason being: instead of hitting the table twice, we are getting the work done in one hit. But still, the consistent gets seems to be reasonably high.

Although ROWS=1 is being shown in the Explain Plan for USERS table, but optimizer is spending most of its time doing a Full Table Scan at this step. Adding an index on USERS_USER_CD column of USERS table should do the trick.

After adding the index on USERS (USERS_USER_CD) column, the query seems to be flying.

SQL> SELECT Tech_PAC_ID, Users_USER_CD, 
  2         count(1) over (partition by Users_USER_CD) TOT_PAC
  3          FROM Tech, Users 
  4         WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
  5           AND Users_USER_CD = Tech_ETI_USER_CD
  6  GROUP BY Tech_PAC_ID, Users_USER_CD;

TECH_PAC_N USERS_USER_CD        TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                     5


Execution Plan
----------------------------------------------------------
Plan hash value: 1753916289

---------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name                  | Rows  | Bytes | Cost (%CPU)| Time  |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                       |     1 |    33 |     7  (15)| 00:00:01 |
|   1 |  WINDOW BUFFER                  |                       |     1 |    33 |     7  (15)| 00:00:01 |
|   2 |   SORT GROUP BY                 |                       |     1 |    33 |     7  (15)| 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID  | TECH                  |     1 |    15 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                |                       |     1 |    33 |     6   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID| USERS                 |     1 |    18 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN          | USERS_MUB_USER_CD_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN           | TECH_USER_CD          |     3 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("TECH_STS"='Y')
   6 - filter("USERS_USER_CD"='ABX65842')
   7 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          1  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        399  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

This time merely 6 “consistent gets” were required. It’s a 99%+ reduction in overall consistent gets.

As I mentioned earlier in the post, I see SQL statements similar to this one over and over again. Analytic function is a nice alternative to the traditional way of writing these types of queries that works extremely well and provides the performance needed for high numbers of executions in a high data volume environment.

More information on Analytic Functions can (should) be obtained from here:

Oracle Database SQL Reference 10g Release 2

On Top-n and Pagination Queries by Tom Kyte

AskTom

One Analytic Function Can do More Than a 1000 Lines of Code by Alex Nuitjen

Happy reading.

The Momen Blog

Tuesday, October 14, 2008

How to read Data between different Characterset Databases

Sunday, September 21, 2008

Analytic Functions: The most ignored ones

Saturday, September 20, 2008

Syntax Highlighter

Total Pageviews