Dear Readers,
Analytic Functions were first introduced in Oracle 8i, way back in 1999. Tom Kyte’s book “Expert One-on-One Oracle” has a dedicated chapter on this topic. Not only Tom’s book, any book on SQL (I have seen so far) has a separate chapter written on Analytic Functions. Yet, the developer community seems to be paying least attention in trying to understand and use them. I think before educating students, tutors should be trained to emphasize the importance of Analytic Functions in day to day life of a developer.
Anyways, here’s a similar case on one of our very busy OLTP database.
This SQL statement topped in the AWR report consuming nearly 84% of database time and was executed more than 1200 times during a 45-minutes AWR report.
A query is required to fetch TECH_PAC_ID for a customer along with the number of records for that customer.
Here’s the original query, its Explain Plan and Statistics:
SQL> set autotrace on
SQL> SELECT Tech_PAC_ID, Users_USER_CD, TOT_PAC
2 FROM (SELECT Tech_PAC_ID, Users_USER_CD
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD) DET,
6 (SELECT COUNT(*) TOT_PAC
7 FROM Tech, Users
8 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
9 AND Users_USER_CD = Tech_ETI_USER_CD) TOT;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Elapsed: 00:00:00.06
Execution Plan
----------------------------------------------------------
Plan hash value: 3349818188
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 54 | 1280 (1)| 00:00:16 |
| 1 | MERGE JOIN CARTESIAN | | 1 | 54 | 1280 (1)| 00:00:16 |
|* 2 | TABLE ACCESS BY INDEX ROWID | TECH | 1 | 18 | 4 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 41 | 640 (1)| 00:00:08 |
|* 4 | TABLE ACCESS FULL | USERS | 1 | 23 | 636 (1)| 00:00:08 |
|* 5 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
| 6 | BUFFER SORT | | 1 | 13 | 1276 (1)| 00:00:16 |
| 7 | VIEW | | 1 | 13 | 640 (1)| 00:00:08 |
| 8 | SORT AGGREGATE | | 1 | 104 | | |
|* 9 | TABLE ACCESS BY INDEX ROWID| TECH | 1 | 12 | 4 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | 1 | 104 | 640 (1)| 00:00:08 |
|* 11 | TABLE ACCESS FULL | USERS | 1 | 92 | 636 (1)| 00:00:08 |
|* 12 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TECH_STS"='Y')
4 - filter("USERS_USER_CD"='ABX65842')
5 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
9 - filter("TECH_STS"='Y')
11 - filter("USERS_USER_CD"='ABX65842')
12 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4671 consistent gets
0 physical reads
0 redo size
558 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Instead of writing two queries to fetch TECH_PAC_ID and COUNT(USERS_USER_CD) we can achieve the same result by using Analytic Functions. The new query not only returns the required result but is also less resource intensive and more database-friendly.
Here goes the enhanced query using Analytic Function:
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
2 count(1) over (partition by Users_USER_CD) TOT_PAC
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD
6 GROUP BY Tech_PAC_ID, Users_USER_CD;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Elapsed: 00:00:00.04
Execution Plan
----------------------------------------------------------
Plan hash value: 1328229640
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 41 | 641 (1)| 00:00:08 |
| 1 | WINDOW BUFFER | | 1 | 41 | 641 (1)| 00:00:08 |
| 2 | SORT GROUP BY | | 1 | 41 | 641 (1)| 00:00:08 |
|* 3 | TABLE ACCESS BY INDEX ROWID| TECH | 1 | 18 | 4 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 41 | 640 (1)| 00:00:08 |
|* 5 | TABLE ACCESS FULL | USERS | 1 | 23 | 636 (1)| 00:00:08 |
|* 6 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TECH_STS"='Y')
5 - filter("USERS_USER_CD"='ABX65842')
6 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2335 consistent gets
0 physical reads
0 redo size
558 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL>
The new query seems to be doing a good job. The “consistent gets” have dropped from 4671 to 2335. That’s nearly half of the original, the reason being: instead of hitting the table twice, we are getting the work done in one hit. But still, the consistent gets seems to be reasonably high.
Although ROWS=1 is being shown in the Explain Plan for USERS table, but optimizer is spending most of its time doing a Full Table Scan at this step. Adding an index on USERS_USER_CD column of USERS table should do the trick.
After adding the index on USERS (USERS_USER_CD) column, the query seems to be flying.
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
2 count(1) over (partition by Users_USER_CD) TOT_PAC
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD
6 GROUP BY Tech_PAC_ID, Users_USER_CD;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Execution Plan
----------------------------------------------------------
Plan hash value: 1753916289
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 7 (15)| 00:00:01 |
| 1 | WINDOW BUFFER | | 1 | 33 | 7 (15)| 00:00:01 |
| 2 | SORT GROUP BY | | 1 | 33 | 7 (15)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID | TECH | 1 | 15 | 4 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 33 | 6 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| USERS | 1 | 18 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | USERS_MUB_USER_CD_IDX | 1 | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TECH_STS"='Y')
6 - filter("USERS_USER_CD"='ABX65842')
7 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets
1 physical reads
0 redo size
558 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
This time merely 6 “consistent gets” were required. It’s a 99%+ reduction in overall consistent gets.
As I mentioned earlier in the post, I see SQL statements similar to this one over and over again. Analytic function is a nice alternative to the traditional way of writing these types of queries that works extremely well and provides the performance needed for high numbers of executions in a high data volume environment.
More information on Analytic Functions can (should) be obtained from here:
Oracle Database SQL Reference 10g Release 2
On Top-n and Pagination Queries by Tom Kyte
AskTom
One Analytic Function Can do More Than a 1000 Lines of Code by Alex Nuitjen
Happy reading.