Hi,
Yesterday I came across yet another performance issue. One of our developers complained that a business report was running extremely slowly. His mail arrived with the query attached:
SELECT XLT.VOU_DET.VOU_DATE,
       XLT.VOU_DET.VALUE_DATE,
       XLT.VOU_DET.DESCRIPTION,
       XLT.VOU_DET.PIX_CODE,
       VOU_TYPE.DESCR,
       XLT.VOU_DET.PIX_BRN,
       XLT.VOU_DET.DR_BAL_ORIG,
       XLT.VOU_DET.CR_BAL_ORIG,
       XLT.VOU_DET.CLOSING_BAL_ORIG,
       XLT.VOU_LOOKUP.NAME,
       XLT.VOU_DET.VOU_SEQ,
       (SELECT TO_CHAR(X.OPEN_BAL_ORIG)
          FROM XLT.VOU_DET X
         WHERE X.ASOF_DATE BETWEEN TO_DATE('01-05-2007', 'dd-mm-yyyy')
                               AND TO_DATE('30-09-2008', 'dd-mm-yyyy')
           AND X.VOU_CODE = '9900016WXYRT01'
           AND X.VOU_SEQ = 1
           AND ROWNUM = 1) OPEN_BAL_ORIG,
       (SELECT COUNT(XLT.VOU_DET.DR_BAL_ORIG)
          FROM (SELECT DR_BAL_ORIG
                  FROM XLT.VOU_DET X
                 WHERE X.ASOF_DATE BETWEEN TO_DATE('01-05-2007', 'dd-mm-yyyy')
                                       AND TO_DATE('30-09-2008', 'dd-mm-yyyy')
                   AND X.VOU_CODE = '9900016WXYRT01'
                   AND X.DR_BAL_ORIG <> 0)) DR_BAL_ORIG_CNT,
       (SELECT COUNT(XLT.VOU_DET.CR_BAL_ORIG)
          FROM (SELECT CR_BAL_ORIG
                  FROM XLT.VOU_DET X
                 WHERE X.ASOF_DATE BETWEEN TO_DATE('01-05-2007', 'dd-mm-yyyy')
                                       AND TO_DATE('30-09-2008', 'dd-mm-yyyy')
                   AND X.VOU_CODE = '9900016WXYRT01'
                   AND X.CR_BAL_ORIG <> 0)) CR_BAL_ORIG_CNT
  FROM XLT.VOU_DET,
       XLT.X_VOU_TYPE VOU_TYPE,
       XLT.VOU_LOOKUP
 WHERE XLT.VOU_DET.VOU_TYPE_ID = VOU_TYPE.VOU_TYPE_ID
   AND XLT.VOU_DET.VOU_REF (+) = XLT.VOU_LOOKUP.CUST_CODE_WNG
   AND XLT.VOU_DET.ASOF_DATE BETWEEN TO_DATE('01-05-2007', 'dd-mm-yyyy')
                                 AND TO_DATE('30-09-2008', 'dd-mm-yyyy')
   AND XLT.VOU_DET.VOU_CODE = '9900016WXYRT01'
 ORDER BY XLT.VOU_DET.VOU_SEQ,
          XLT.VOU_DET.ASOF_DATE;
This looks like a case of missing analytic functions.
The three scalar subqueries in the SELECT list can easily be replaced with simple analytic functions. The first one fetches the opening balance:
(SELECT TO_CHAR(X.OPEN_BAL_ORIG)
   FROM XLT.VOU_DET X
  WHERE X.ASOF_DATE BETWEEN TO_DATE('01-05-2007', 'dd-mm-yyyy')
                        AND TO_DATE('30-09-2008', 'dd-mm-yyyy')
    AND X.VOU_CODE = '9900016WXYRT01'
    AND X.VOU_SEQ = 1
    AND ROWNUM = 1) OPEN_BAL_ORIG,
and can be rewritten as:

FIRST_VALUE(OPEN_BAL_ORIG)
  OVER (PARTITION BY VOU_CODE
        ORDER BY ASOF_DATE, DECODE(VOU_SEQ, 0, 99, VOU_SEQ)) OPEN_BAL_ORIG,

(Note the DECODE in the ORDER BY: it pushes rows with VOU_SEQ = 0 to the end of the sort within each ASOF_DATE, so the row with VOU_SEQ = 1 is picked first.)
The second and third fetch the number of debit and credit transactions respectively, and can be written as:

COUNT(DECODE(DR_BAL_ORIG, NULL, NULL, 0, NULL, 1))
  OVER (PARTITION BY VOU_CODE) DR_BAL_ORIG_CNT,

and

COUNT(DECODE(CR_BAL_ORIG, NULL, NULL, 0, NULL, 1))
  OVER (PARTITION BY VOU_CODE) CR_BAL_ORIG_CNT

Here the DECODE returns NULL for NULL or zero balances and 1 otherwise, so COUNT counts only the non-zero rows, which matches the <> 0 predicate in the original subqueries.
Now it's time to execute the modified query and (1) compare the result sets, (2) measure the time taken, and (3) compare the statistics with those of the original query.
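As a sketch, the result sets can be compared with a set difference in both directions (drop the ORDER BY from each query first, since MINUS ignores row order; and because MINUS also removes duplicates, a count-based comparison would be needed if duplicate rows were possible):

-- rows produced by the original query but not by the rewrite; expect no rows
SELECT * FROM ( /* original query, without its ORDER BY */ )
MINUS
SELECT * FROM ( /* modified query, without its ORDER BY */ );

-- and the reverse direction; expect no rows here either
SELECT * FROM ( /* modified query, without its ORDER BY */ )
MINUS
SELECT * FROM ( /* original query, without its ORDER BY */ );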
Well, the result sets were compared and they matched. The enhanced query completes in under 20 seconds, while the original took just over 17 minutes; at 1,046.91 seconds versus 17.35 seconds, the new query is roughly 60 times faster than the original.
Lastly, let’s compare statistics.
Statistics of the original query:
SQL> set autotrace traceonly stat
SQL> @tuneme
661 rows selected.
Elapsed: 00:17:26.91
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2066944 consistent gets
753812 physical reads
116 redo size
68506 bytes sent via SQL*Net to client
842 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
661 rows processed
SQL>
Statistics of the modified query:
SQL> set autotrace traceonly exp stat
SQL> @tunedquery
661 rows selected.
Elapsed: 00:00:17.35
Statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
22514 consistent gets
2580 physical reads
0 redo size
68003 bytes sent via SQL*Net to client
547 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
661 rows processed
SQL>
Instead of 2,066,944 consistent gets and 753,812 physical reads, the modified query needed merely 22,514 consistent gets and 2,580 physical reads. That is a 98.9% reduction in LIOs (1 - 22,514/2,066,944) and a 99.7% reduction in PIOs (1 - 2,580/753,812).
Once again, analytic functions beat the traditional query-writing style.
Nice one! I especially liked the trick of using decode to remove the zeroes from the count instead of the subquery predicate.
Though you're clearly avoiding many table accesses, you may consider adding the before/after execution plans, or better, the actual execution statistics ;)
Momen, I don't know anything about analytic functions, but I have a doubt: did you flush the buffer cache in between?
-Nand
Hi Nand,
No, I did not flush the cache, and I cannot afford to, as it is a production database.
So it might be that in the second run (with the analytic functions), the blocks came from the buffer cache instead of being read from the datafiles, and that is why it was faster. Please clarify... am I missing something?
-Nand
Nand,
True, but if you analyze the query, you'll see that instead of accessing the VOU_DET table only once to retrieve the required output, the developer was accessing it four times. This is where the LIOs came from.
I will try to post the execution plans of both queries, which should make the difference between them clear.
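In the meantime, here is a sketch of one way to capture the actual execution statistics, assuming Oracle 10g or later and that the query can be re-run:

-- re-run the query of interest with the GATHER_PLAN_STATISTICS hint;
-- the "..." stands for the query's own select list and clauses
SELECT /*+ gather_plan_statistics */ ...

-- then, from the same session, display the runtime plan along with
-- actual row counts and buffer gets per step
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));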
Asif Momen