Views expressed here are solely my own. Please make sure that you test the code/queries that appear on my blog before applying them to a production environment.
Monday, December 08, 2008
"Heh, heh, that"s a joke, son"
Well, the title of this post is the error message from "Oracle® Database Error Messages: 10g Release 2 (10.2)" document.
PLS-00101: reserved for future use
Cause: This error message is not used yet. (Heh, heh, that's a joke, son.)
Action: none
Although it has been reserved for future use, it somehow got leaked into the Oracle Database 10g documentation. However, this same error is missing from the Oracle 11g Database Error Messages. I think it's reserved to be used with Oracle 12g, who knows?
Has anybody encountered this error?
Sunday, November 23, 2008
Dynamic Sampling Myths Dispelled
There are a couple of myths associated with Dynamic Sampling, and I thought of dispelling them with the help of this post. The two myths associated with DS are:
i) The default number of blocks sampled is 32 in Oracle 10g, and
ii) Tables having fewer than 32 blocks are not considered by the optimizer for dynamic sampling.
Myth 1: The default number of blocks sampled is 32 in Oracle 10g
The default number of dynamic sampling blocks is 32. This is set using the hidden parameter "_optimizer_dyn_smp_blks" (of course, one should not change it). Here's the query to see the default settings:
SQL> column "Parameter" format a30
SQL> column "Session Value" format a20
SQL> column "Instance Value" format a20
SQL>
SQL> select a.ksppinm "Parameter", b.ksppstvl "Session Value", c.ksppstvl "Instance Value"
  2  from x$ksppi a, x$ksppcv b, x$ksppsv c
  3  where a.indx = b.indx and a.indx = c.indx
  4  and ksppinm = '_optimizer_dyn_smp_blks';

Parameter                      Session Value        Instance Value
------------------------------ -------------------- --------------------
_optimizer_dyn_smp_blks        32                   32

Starting with Oracle 10g, the "optimizer_dynamic_sampling" initialization parameter defaults to "2", meaning the optimizer will dynamically sample all unanalyzed tables. At this level, the number of blocks sampled is two times the default number of dynamic sampling blocks, so the optimizer will sample 64 blocks. To verify how many blocks are sampled, we need to dig into the 10053 trace output. For the purpose of this test, I will create a test table and dump some data into it, then enable the 10053 trace, execute the query, and disable the trace.
SQL> drop table test purge;

Table dropped.

SQL> create table test as select * from all_objects where 1=2;

Table created.

SQL> create index t_object_id on test(object_id);

Index created.

SQL> insert into test select * from all_objects;

35023 rows created.

SQL> commit;

Commit complete.

SQL> alter session set tracefile_identifier = 'blksize';

Session altered.

SQL> alter session set events '10053 trace name context forever, level 1';

Session altered.

SQL> select * from test where object_id = 123;

no rows selected

SQL> alter session set events '10053 trace name context off';

Session altered.

SQL> disc
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>

Digging into the trace, we see the following:
** Executed dynamic sampling query:
    level : 2
    sample pct. : 12.701613
    actual sample size : 4097
    filtered sample card. : 0
    orig. card. : 82
    block cnt. table stat. : 496
    block cnt. for sampling: 496
    max. sample block cnt. : 64
    sample block cnt. : 63
    min. sel. est. : 0.01000000

Well, it's clear from the trace that at level 2 the maximum number of blocks sampled (max. sample block cnt) is 64, and the actual number of blocks sampled (sample block cnt) is 63. (For some reason, Oracle always samples one block less.) So the first myth has been dispelled and we know how many blocks are sampled.

Myth 2: Tables having less than 32 blocks are not considered by the optimizer for dynamic sampling

It is widely believed that tables having fewer than 32 blocks are ignored by the optimizer for dynamic sampling. Let's continue using the same table as in case 1. However, it contains data, so we will truncate it. After truncating, verify the number of blocks allocated by querying the USER_SEGMENTS view.
SQL> truncate table test;

Table truncated.

SQL> select blocks from user_segments where segment_name = 'TEST';

    BLOCKS
----------
         8

This table has only 8 blocks allocated. Let's see whether the optimizer considers dynamic sampling on this table by executing a simple query.
SQL> set autotrace traceonly exp
SQL> select * from test where object_id = 123;

Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   128 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST |     1 |   128 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_ID"=123)

Note
-----
   - dynamic sampling used for this statement

SQL> set autotrace off
SQL>

Yes, the Oracle optimizer did consider dynamic sampling although the number of blocks was less than 32. So, in this post we were able to dispel both myths associated with dynamic sampling through very simple and reproducible examples.
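As an aside (not part of the tests above), the sampling level can also be raised per statement with the DYNAMIC_SAMPLING hint, which is a handy way to experiment with the sampled block counts shown in the 10053 trace. A minimal sketch against the same TEST table:

SQL> select /*+ dynamic_sampling(test 10) */ *
  2  from test
  3  where object_id = 123;

At level 10 the optimizer samples all blocks of the table; the intermediate levels sample progressively more blocks than the default.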
Wednesday, November 05, 2008
Analytic Functions: A Savior
Hi,
Yesterday, I came across yet another performance issue. I received a complaint from one of our developers that a business report of theirs was running extremely slowly.
I received his mail and attached was the query:
SELECT XLT.VOU_DET.VOU_DATE, XLT.VOU_DET.VALUE_DATE, XLT.VOU_DET.DESCRIPTION,
       XLT.VOU_DET.PIX_CODE, VOU_TYPE.DESCR, XLT.VOU_DET.PIX_BRN,
       XLT.VOU_DET.DR_BAL_ORIG, XLT.VOU_DET.CR_BAL_ORIG,
       XLT.VOU_DET.CLOSING_BAL_ORIG, XLT.VOU_LOOKUP.NAME, XLT.VOU_DET.VOU_SEQ,
       (SELECT TO_CHAR(X.OPEN_BAL_ORIG)
          FROM XLT.VOU_DET X
         WHERE X.ASOF_DATE BETWEEN to_date('01-05-2007', 'dd-mm-yyyy')
                               AND to_date('30-09-2008', 'dd-mm-yyyy')
           AND X.VOU_CODE = '9900016WXYRT01'
           AND X.VOU_SEQ = 1
           AND ROWNUM = 1) OPEN_BAL_ORIG,
       (SELECT count(XLT.VOU_DET.DR_BAL_ORIG)
          FROM (select DR_BAL_ORIG
                  From XLT.VOU_DET X
                 WHERE X.ASOF_DATE BETWEEN to_date('01-05-2007', 'dd-mm-yyyy')
                                       AND to_date('30-09-2008', 'dd-mm-yyyy')
                   AND X.VOU_CODE = '9900016WXYRT01'
                   AND X.DR_BAL_ORIG <> 0)) DR_BAL_ORIG_CNT,
       (SELECT count(XLT.VOU_DET.CR_BAL_ORIG)
          FROM (select CR_BAL_ORIG
                  From XLT.VOU_DET X
                 WHERE X.ASOF_DATE BETWEEN to_date('01-05-2007', 'dd-mm-yyyy')
                                       AND to_date('30-09-2008', 'dd-mm-yyyy')
                   AND X.VOU_CODE = '9900016WXYRT01'
                   AND X.CR_BAL_ORIG <> 0)) CR_BAL_ORIG_CNT
  FROM XLT.VOU_DET, XLT.X_VOU_TYPE VOU_TYPE, XLT.VOU_LOOKUP
 WHERE XLT.VOU_DET.VOU_TYPE_ID = VOU_TYPE.VOU_TYPE_ID
   AND XLT.VOU_DET.VOU_REF(+) = XLT.VOU_LOOKUP.CUST_CODE_WNG
   AND XLT.VOU_DET.ASOF_DATE BETWEEN to_date('01-05-2007', 'dd-mm-yyyy')
                                 AND to_date('30-09-2008', 'dd-mm-yyyy')
   AND XLT.VOU_DET.VOU_CODE = '9900016WXYRT01'
 ORDER BY XLT.VOU_DET.VOU_SEQ, XLT.VOU_DET.ASOF_DATE;

This looks like a case of missing analytic functions. The three scalar SELECT statements within the query can easily be replaced with simple analytic functions. The first one fetches the opening balance:
(SELECT TO_CHAR(X.OPEN_BAL_ORIG)
   FROM XLT.VOU_DET X
  WHERE X.ASOF_DATE BETWEEN to_date('01-05-2007', 'dd-mm-yyyy')
                        AND to_date('30-09-2008', 'dd-mm-yyyy')
    AND X.VOU_CODE = '9900016WXYRT01'
    AND X.VOU_SEQ = 1
    AND ROWNUM = 1) OPEN_BAL_ORIG,

and could easily be rewritten as:
first_value(OPEN_BAL_ORIG) over (partition by VOU_CODE
                                 order by ASOF_DATE,
                                          decode(VOU_SEQ, 0, 99, VOU_SEQ)) VOU_SEQ,

The second and third are the number of debit and credit transactions respectively, and can also be written as:
count(decode(DR_BAL_ORIG, null, null, 0, null, 1)) over (partition by VOU_CODE) DR_BAL_ORIG_cnt,

and
count(decode(CR_BAL_ORIG, null, null, 0, null, 1)) over (partition by VOU_CODE) CR_BAL_ORIG_cnt

Now it's time to execute the modified query and (1) compare the result set, (2) measure the time taken, and (3) compare statistics with the original query. The result sets were compared and matched. The enhanced query completes in less than 20 seconds while the original took just over 17 minutes, making the new query roughly 60 times faster than the original one. Lastly, let's compare statistics.

Statistics of the original query:
SQL> set autotrace traceonly stat
SQL> @tuneme

661 rows selected.

Elapsed: 00:17:26.91

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    2066944  consistent gets
     753812  physical reads
        116  redo size
      68506  bytes sent via SQL*Net to client
        842  bytes received via SQL*Net from client
         46  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        661  rows processed

SQL>

Statistics of the modified query:
SQL> set autotrace traceonly exp stat
SQL> @tunedquery

661 rows selected.

Elapsed: 00:00:17.35

Statistics
----------------------------------------------------------
          2  recursive calls
          0  db block gets
      22514  consistent gets
       2580  physical reads
          0  redo size
      68003  bytes sent via SQL*Net to client
        547  bytes received via SQL*Net from client
         46  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
        661  rows processed

SQL>

Instead of 2,066,944 consistent gets and 753,812 physical reads, it took merely 22,514 consistent gets and 2,580 physical reads. That's nearly a 99% reduction in LIOs and a 99%+ reduction in PIOs. Once again, analytic functions win over the traditional query writing style.
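To make the pattern reusable, here is a minimal, self-contained sketch (on the standard SCOTT.EMP demo table, not the production tables above) of replacing a correlated scalar count with an analytic COUNT:

-- Traditional style: one scalar subquery evaluated per row.
select e.ename, e.deptno,
       (select count(*) from emp i where i.deptno = e.deptno) dept_cnt
from   emp e;

-- Analytic style: the same answer in a single pass over EMP.
select ename, deptno,
       count(*) over (partition by deptno) dept_cnt
from   emp;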
Monday, November 03, 2008
How to read Data between different Characterset Databases - Part 2
Hazem Ameen has left a comment on my previous post, where he says he is able to read/write data over a database link between two databases having different charactersets.
We have your same environment and reading/writing over db link is fine. Oracle does character-set conversion between the 2 Arabic character-set as expected.
Client NLS_LANG: America_America.ar8mswin1256
Server1: ar8mswin1256
Server 2: ar8iso8859p6
I can only suspect your client NLS_LANG
Our version is 10.2.0.4 on Linux. Client is MS Windows Vista 10.2.0.4
reading/writing over db link is fine
What I would assert from this is: although you have databases with two different charactersets, your client characterset is the SAME in both cases. That is why you are able to read/write over the db link. In our case, however, clients connecting to database A have the same characterset as A, while clients connecting to database B have the same characterset as B.
Here's a test case with my client characterset set to "AR8MSWIN1256".
Create tables in both the databases:
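For illustration, a minimal sketch of such a table (the table name follows the earlier post; the Arabic data value is a placeholder):

create table test_iso (
  id   number,
  name varchar2(100)
);

insert into test_iso values (1, '...Arabic text...');
commit;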
Create database links in both databases and query data from other side:
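And a sketch of the link plus the cross-database query (the connection details are placeholders; the link name matches the earlier post):

create database link testdb
  connect to scott identified by tiger
  using 'DBB';

select id, name from test_iso@testdb;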
In the above example, we were able to read data over db links, meaning the client characterset plays the vital role.
This means all your clients' characterset is set to "AR8MSWIN1256" for both types of databases. This is the reason you are able to read/write over a db link.
I hope I was clear in my examples. Thanks for leaving your comment.
Regards
Asif Momen
Tuesday, October 14, 2008
How to read Data between different Characterset Databases
We have two 24x7 production databases with different charactersets. Yes, both charactersets are related to Arabic language. Database A’s characterset is AR8MSWIN1256 while database B’s characterset is AR8ISO8859P6.
We had a requirement wherein database A will read Arabic data from database B. If you create a database link and try to query data from database B it will appear to be garbage, as shown below:
whereas when you run the same query on database B, this is how the output looks:
Even if you are unaware of the Arabic language, you can visually compare the two and see that the output is garbage when queried from database A.
The DUMP function comes in handy in such situations. It returns the internal representation of an expression. When used against the "name" field, I get this output.
SQL> select dump(name) from test_iso@testdb;

DUMP(NAME)
--------------------------------------------------------------------
Typ=1 Len=13: 227,205,227,207,32,194,213,221,32,227,196,227,228

where "Typ" refers to the datatype, "Len" is the length of the string, and everything after ":" is the comma-separated list of byte values of all the characters in the "name" field. I wrote a small function that converts all the comma-separated byte values back into a character string.
create or replace function read_name (p_id in number) return varchar2
is
  t_str    varchar2(2000);
  l_name   varchar2(2000);
  l_output varchar2(2000);
begin
  -- Build "select chr(..)||chr(..)|| ... from dual" from the DUMP output.
  select 'select ' || 'chr(' ||
         replace(substr(dump(name, 1010), instr(dump(name, 1010), ':') + 2),
                 ',', ')|| chr(') || ') from dual'
    into t_str
    from test_iso@testdb
   where id = p_id;

  -- dbms_output.put_line(t_str);

  execute immediate t_str into l_output;

  return l_output;
end;
/

Now, using this function, I am able to query the remote table in a readable format. Here's the output:
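A hypothetical usage of the function (illustrative only) would simply be:

select id, read_name(id) name
from   test_iso@testdb;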
Sunday, September 21, 2008
Analytic Functions: The most ignored ones
Dear Readers,
Analytic functions were first introduced in Oracle 8i, way back in 1999. Tom Kyte's book "Expert One-on-One Oracle" has a dedicated chapter on the topic. Not only Tom's book; every book on SQL I have seen so far has a separate chapter on analytic functions. Yet the developer community seems to pay the least attention to understanding and using them. I think that before educating students, tutors should be trained to emphasize the importance of analytic functions in the day-to-day life of a developer.
Anyways, here’s a similar case on one of our very busy OLTP database.
This SQL statement topped in the AWR report consuming nearly 84% of database time and was executed more than 1200 times during a 45-minutes AWR report.
A query is required to fetch TECH_PAC_ID for a customer along with the number of records for that customer.
Here’s the original query, its Explain Plan and Statistics:
SQL> set autotrace on
SQL> SELECT Tech_PAC_ID, Users_USER_CD, TOT_PAC
  2  FROM (SELECT Tech_PAC_ID, Users_USER_CD
  3          FROM Tech, Users
  4         WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842')
  5           AND Users_USER_CD = Tech_ETI_USER_CD) DET,
  6       (SELECT COUNT(*) TOT_PAC
  7          FROM Tech, Users
  8         WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842')
  9           AND Users_USER_CD = Tech_ETI_USER_CD) TOT;

TECH_PAC_N USERS_USER_CD           TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                      5

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 3349818188

------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |              |     1 |    54 |  1280   (1)| 00:00:16 |
|   1 |  MERGE JOIN CARTESIAN           |              |     1 |    54 |  1280   (1)| 00:00:16 |
|*  2 |   TABLE ACCESS BY INDEX ROWID   | TECH         |     1 |    18 |     4   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                 |              |     1 |    41 |   640   (1)| 00:00:08 |
|*  4 |     TABLE ACCESS FULL           | USERS        |     1 |    23 |   636   (1)| 00:00:08 |
|*  5 |     INDEX RANGE SCAN            | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
|   6 |   BUFFER SORT                   |              |     1 |    13 |  1276   (1)| 00:00:16 |
|   7 |    VIEW                         |              |     1 |    13 |   640   (1)| 00:00:08 |
|   8 |     SORT AGGREGATE              |              |     1 |   104 |            |          |
|*  9 |      TABLE ACCESS BY INDEX ROWID| TECH         |     1 |    12 |     4   (0)| 00:00:01 |
|  10 |       NESTED LOOPS              |              |     1 |   104 |   640   (1)| 00:00:08 |
|* 11 |        TABLE ACCESS FULL        | USERS        |     1 |    92 |   636   (1)| 00:00:08 |
|* 12 |        INDEX RANGE SCAN         | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("TECH_STS"='Y')
   4 - filter("USERS_USER_CD"='ABX65842')
   5 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)
   9 - filter("TECH_STS"='Y')
  11 - filter("USERS_USER_CD"='ABX65842')
  12 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4671  consistent gets
          0  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Instead of writing two queries to fetch TECH_PAC_ID and COUNT(USERS_USER_CD), we can achieve the same result using an analytic function. The new query not only returns the required result but is also less resource intensive and more database-friendly. Here is the enhanced query using an analytic function:
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
  2         count(1) over (partition by Users_USER_CD) TOT_PAC
  3  FROM Tech, Users
  4  WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842')
  5  AND Users_USER_CD = Tech_ETI_USER_CD
  6  GROUP BY Tech_PAC_ID, Users_USER_CD;

TECH_PAC_N USERS_USER_CD           TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                      5

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 1328229640

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     1 |    41 |   641   (1)| 00:00:08 |
|   1 |  WINDOW BUFFER                |              |     1 |    41 |   641   (1)| 00:00:08 |
|   2 |   SORT GROUP BY               |              |     1 |    41 |   641   (1)| 00:00:08 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| TECH         |     1 |    18 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS              |              |     1 |    41 |   640   (1)| 00:00:08 |
|*  5 |      TABLE ACCESS FULL        | USERS        |     1 |    23 |   636   (1)| 00:00:08 |
|*  6 |      INDEX RANGE SCAN         | TECH_USER_CD |     3 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("TECH_STS"='Y')
   5 - filter("USERS_USER_CD"='ABX65842')
   6 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2335  consistent gets
          0  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>
SQL>

The new query seems to be doing a good job. The "consistent gets" have dropped from 4671 to 2335, nearly half of the original, the reason being that instead of hitting the tables twice we get the work done in one pass. Still, the consistent gets seem reasonably high. Although ROWS=1 is shown in the explain plan for the USERS table, the optimizer is spending most of its time doing a full table scan at that step. Adding an index on the USERS_USER_CD column of the USERS table should do the trick. After adding the index on USERS (USERS_USER_CD), the query seems to be flying.
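Judging by the index name in the plan below, the index added between the two runs was along these lines (a reconstruction, not shown in the original post):

create index USERS_MUB_USER_CD_IDX on USERS (USERS_USER_CD);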
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
  2         count(1) over (partition by Users_USER_CD) TOT_PAC
  3  FROM Tech, Users
  4  WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842')
  5  AND Users_USER_CD = Tech_ETI_USER_CD
  6  GROUP BY Tech_PAC_ID, Users_USER_CD;

TECH_PAC_N USERS_USER_CD           TOT_PAC
---------- -------------------- ----------
236XXX123  ABX65842                      5

Execution Plan
----------------------------------------------------------
Plan hash value: 1753916289

---------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                       |     1 |    33 |     7  (15)| 00:00:01 |
|   1 |  WINDOW BUFFER                 |                       |     1 |    33 |     7  (15)| 00:00:01 |
|   2 |   SORT GROUP BY                |                       |     1 |    33 |     7  (15)| 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID | TECH                  |     1 |    15 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS               |                       |     1 |    33 |     6   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID| USERS                |     1 |    18 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN         | USERS_MUB_USER_CD_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | TECH_USER_CD          |     3 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("TECH_STS"='Y')
   6 - filter("USERS_USER_CD"='ABX65842')
   7 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
       filter("TECH_ETI_USER_CD" IS NOT NULL)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          1  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        399  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

This time merely 6 "consistent gets" were required. That's a 99%+ reduction in overall consistent gets. As I mentioned earlier in the post, I see SQL statements similar to this one over and over again. An analytic function is a nice alternative to the traditional way of writing these types of queries; it works extremely well and provides the performance needed for high numbers of executions in a high-data-volume environment.

More information on analytic functions can (should) be obtained from here:

* Oracle Database SQL Reference 10g Release 2
* On Top-n and Pagination Queries by Tom Kyte, AskTom
* One Analytic Function Can do More Than a 1000 Lines of Code by Alex Nuitjen

Happy reading.
Saturday, September 20, 2008
Syntax Highlighter
Dear Readers,
Most of you might have experienced difficulties placing code snippets in a nicely formatted way on Blogger.com. I had previously been using Greg Houston's code formatter, but when the code contains pipes ("|"), as in an explain plan, Blogger fails to display it correctly.
But recently I stumbled upon one of Karen Morton's posts where she discusses a way to format your code snippets on Blogger. Here are more details on highlighting your code on Blogger.com using SyntaxHighlighter.
The code below has been formatted using SyntaxHighlighter:
SQL> conn scott/tiger
Connected.
SQL> select * from tab;

TNAME                          TABTYPE  CLUSTERID
------------------------------ ------- ----------
DEPT                           TABLE
EMP                            TABLE
BONUS                          TABLE
SALGRADE                       TABLE
TEST_RUN                       TABLE
PLAN_TABLE                     TABLE
EMP_VIEW                       VIEW
EMP2                           TABLE

8 rows selected.

SQL>

More interesting stuff can be found on Fahd Shariff's blog and Morten Lyhr's blog. Happy formatting !!!
Saturday, August 02, 2008
Dynamic Sampling and Table Partitions
Dynamic Sampling was first introduced in Oracle 9i Release 2. Its purpose is to improve query performance by determining more accurate estimates for predicate selectivity and statistics for tables and indexes.
Dynamic sampling is used predominantly with un-analyzed tables. Before hard parsing a query, Oracle determines whether dynamic sampling would improve its performance. If so, the optimizer issues recursive SQL statements to estimate the necessary statistics dynamically.
Here’s an example of dynamic sampling:
Operating System: Windows XP
Database: Oracle 10g Release 2 (10.2.0.4)
SQL> create table t (a number);

Table created.

SQL> set autotrace traceonly exp
SQL> select * from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

SQL> set autotrace off

It's clear from the above execution plan that dynamic sampling was performed. But it seems dynamic sampling is not partition friendly, or rather, dynamic sampling is partition unaware. What I mean is: if you have a partitioned table and at least one partition is analyzed, then dynamic sampling is NOT performed on any of the un-analyzed partitions. This is because the optimizer looks at columns like NUM_ROWS of the DBA_TABLES view and considers the table analyzed. Let us create a partitioned table with a local index.
SQL> create table part(
  2    sno number,
  3    dt date,
  4    remark varchar2(30))
  5  partition by range(dt) (
  6    partition p1 values less than (to_date('01-02-2008', 'dd-mm-yyyy')),
  7    partition p2 values less than (to_date('01-03-2008', 'dd-mm-yyyy')),
  8    partition p3 values less than (to_date('01-04-2008', 'dd-mm-yyyy')),
  9    partition p4 values less than (to_date('01-05-2008', 'dd-mm-yyyy'))
 10  );

Table created.

SQL> create index part_idx on part(dt, sno) local;

Index created.

SQL>

Now, populate all but the last partition with test data.
SQL> insert into part
  2  select level,
  3         to_date('01-01-2008', 'dd-mm-yyyy') + level - 1,
  4         'test'
  5  from dual
  6  connect by level <= 90;

90 rows created.

SQL> commit;

Commit complete.

SQL>

Let us now collect statistics for partitions P1, P2, and P3.
SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', -
>      partname => 'p1', estimate_percent => null, -
>      method_opt => 'for all columns size 1', cascade => true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', -
>      partname => 'p2', estimate_percent => null, -
>      method_opt => 'for all columns size 1', cascade => true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', -
>      partname => 'p3', estimate_percent => null, -
>      method_opt => 'for all columns size 1', cascade => true);

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows
  2  from user_tables
  3  where table_name = 'PART';

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
PART                                   90

SQL> select partition_name, num_rows
  2  from user_tab_partitions
  3  where table_name = 'PART';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P1                                     31
P2                                     29
P3                                     30
P4

SQL> select partition_name, num_rows
  2  from user_ind_partitions
  3  where index_name = 'PART_IDX';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P1                                     31
P2                                     29
P3                                     30
P4

SQL>

It's time to test whether the optimizer considers dynamic sampling or not.
SQL> set autotrace traceonly exp
SQL> select *
  2  from part
  3  where dt = to_date('01-04-2008', 'dd-mm-yyyy')
  4  and sno = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    16 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |    16 |     2   (0)| 00:00:01 |     4 |     4 |
|*  2 |   TABLE ACCESS FULL    | PART |     1 |    16 |     2   (0)| 00:00:01 |     4 |     4 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "SNO"=1)

SQL> set autotrace off
SQL>

No! Note the absence of the "dynamic sampling" note from the plan: the optimizer thinks it has enough information at hand to go with, rather than sampling dynamically. Now, let us populate the unanalyzed partition with data and re-run the same query.
SQL> insert into part select 1, to_date('01-04-2008', 'dd-mm-yyyy'), 'test'
  2  from dual connect by level <= 1000;

1000 rows created.

SQL> commit;

Commit complete.

SQL> set autotrace traceonly exp
SQL> select * from part
  2  where dt = to_date('01-04-2008', 'dd-mm-yyyy')
  3  and sno = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    16 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |    16 |     2   (0)| 00:00:01 |     4 |     4 |
|*  2 |   TABLE ACCESS FULL    | PART |     1 |    16 |     2   (0)| 00:00:01 |     4 |     4 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "SNO"=1)

SQL> set autotrace off

Well, the optimizer chose the same plan because the statistics were not up to date. Let us gather statistics on the partition in question and execute the same query again.
SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', -
>      partname => 'p4', estimate_percent => null, -
>      method_opt => 'for all columns size 1', cascade => true);

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly exp
SQL> select * from part
  2  where dt = to_date('01-04-2008', 'dd-mm-yyyy')
  3  and sno = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  1000 | 16000 |     3   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |  1000 | 16000 |     3   (0)| 00:00:01 |     4 |     4 |
|*  2 |   TABLE ACCESS FULL    | PART |  1000 | 16000 |     3   (0)| 00:00:01 |     4 |     4 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "SNO"=1)

SQL> set autotrace off
SQL>

This time the optimizer had updated statistics, which gives it a better and clearer picture for picking the most appropriate access path. So the optimizer thinks a "TABLE ACCESS FULL" on partition P4 is appropriate, which makes sense too. I purposely populated partition P4 with duplicate data so that we would see a difference in the optimizer's access path. Finally, the conclusions are:

* Dynamic sampling ignores partition-level statistical information.
* Always update statistics after a data load process, or assign default statistics to the unanalyzed partition.

Happy reading.
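A sketch of the second recommendation (the numbers here are illustrative placeholders): seed a freshly loaded, unanalyzed partition with representative statistics until real ones can be gathered.

begin
  dbms_stats.set_table_stats(
    ownname  => user,
    tabname  => 'PART',
    partname => 'P4',
    numrows  => 1000,  -- representative row count
    numblks  => 10);   -- representative block count
end;
/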
Friday, August 01, 2008
Bugs in REMAINDER Function (Updated)
Dear all,
Regarding Bugs in REMAINDER Function, I have opened a call with Oracle Support and they say "REMAINDER function works as intended and this is a documentation issue which requires minor clarification".
Oracle documentation has modified the text as follows:
"If n1 != 0, then the remainder is n2 - (n1*N) where N is the integer nearest n2/n1. If n2/n1 equals x.5, then N is the nearest even integer."
A bug (7236077) has also been logged.
Regards
Sunday, July 27, 2008
Funny “ORA-00942: table or view does not exist” Error Message
Dear all,
We all know when the "ORA-00942" error message is returned. For instance, if I try to query a non-existent table, Oracle joyfully returns this error message.

SQL> select * from funny;
select * from funny
              *
ERROR at line 1:
ORA-00942: table or view does not exist

The "ORA-00942" error message is thrown in our face whenever we try to perform any action (CREATE, ALTER, DROP, INSERT, UPDATE, DELETE ...) against a non-existent table. But this error is sometimes misleading, like the one below:

SQL> create table t(a number);

Table created.

SQL> create view v as select * from t;

View created.

SQL> drop table v;
drop table v
           *
ERROR at line 1:
ORA-00942: table or view does not exist

The error message reports "table or view does not exist", but a view named "V" definitely exists in the same schema. Wouldn't it be more appropriate if Oracle reported "table does not exist"?
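For completeness, the object is of course removed with the matching DDL:

SQL> drop view v;

View dropped.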
Thursday, July 10, 2008
MIN and MAX Functions in a Single Query are Disastrous
Dear Readers,
I would like to discuss a very interesting point about indexes in this post. When we are interested in finding the minimum value of an indexed column, instead of reading the entire table or the entire index, Oracle intelligently uses the index to navigate to the first index leaf block (the leftmost block) and quickly finds the minimum value of the indexed column.
A simple demo proves this:
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for 32-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> drop table t purge;

Table dropped.

SQL> create table t as select level sno, 'name ' || level name
  2  from dual connect by level <= 10000000;

Table created.

SQL> create unique index t_idx on t(sno);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> set autotrace on
SQL> select min(sno) from t;

  MIN(SNO)
----------
         1

Execution Plan
----------------------------------------------------------
Plan hash value: 2683064407

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |     1 |     6 |  9034   (2)| 00:01:49 |
|   1 |  SORT AGGREGATE            |       |     1 |     6 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_IDX |   10M |   57M |            |          |
------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          1  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        399  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace off
SQL>

Although this table contains 10 million rows, Oracle required only 3 consistent gets to fetch the minimum value. Superb !!! Similarly, when finding the maximum value, Oracle reads the last leaf block on the right-hand side of the index structure.
SQL> set autotrace on
SQL> select max(sno) from t;

  MAX(SNO)
----------
  10000000

Execution Plan
----------------------------------------------------------
Plan hash value: 2683064407

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |     1 |     6 |  9034   (2)| 00:01:49 |
|   1 |  SORT AGGREGATE            |       |     1 |     6 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_IDX |   10M |   57M |            |          |
------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          2  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        399  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace off
SQL>

Once again, only 3 consistent gets were required to fetch the maximum value. But things get messy when you use both MIN and MAX in the same query. Instead of using the same "INDEX FULL SCAN (MIN/MAX)" path twice, reading the leftmost block and the rightmost block to arrive at the minimum and maximum values, Oracle goes with a FULL TABLE SCAN. A full table scan on 10 million rows !!!
SQL> set autotrace on
SQL> select min(sno), max(sno) from t;

  MIN(SNO)   MAX(SNO)
---------- ----------
         1   10000000

Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     6 |  9034   (2)| 00:01:49 |
|   1 |  SORT AGGREGATE    |      |     1 |     6 |            |          |
|   2 |   TABLE ACCESS FULL| T    |   10M |   57M |  9034   (2)| 00:01:49 |
---------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      32928  consistent gets
      11391  physical reads
          0  redo size
        472  bytes sent via SQL*Net to client
        399  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace off
SQL>

Given this terrible behavior of the Oracle optimizer, what I infer and suggest is: write separate queries to fetch the MIN and MAX values instead of combining them into one query.
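A minimal sketch of a variation on that suggestion: two scalar subqueries in a single statement, each of which remains eligible for its own INDEX FULL SCAN (MIN/MAX):

select (select min(sno) from t) min_sno,
       (select max(sno) from t) max_sno
from   dual;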
Saturday, July 05, 2008
Bugs in REMAINDER Function
While browsing through the SQL Reference documentation (Oracle 10g Release 2), I stopped at the REMAINDER function. REMAINDER returns the remainder of n2 divided by n1, and internally uses the ROUND function to arrive at the remainder value.
Oracle documentation also says, "If n1 != 0, then the remainder is n2 - (n1*N) where N is the integer nearest n2/n1."
SQL> Select REMAINDER(501, 5) from dual;

REMAINDER(501,5)
----------------
               1

SQL>
But, because of an internal bug, REMAINDER function returns incorrect results when "n2/n1" results in 0.5, 1.5, 2.5, .....
SQL> Select REMAINDER(5, 2) from dual;

REMAINDER(5,2)
--------------
             1

SQL>

According to the formula,
n2 - (n1 * N) = 5 - (2 * ROUND(5/2)) = 5 - (2 * 3) = 5 - 6 = -1

the value returned should be "-1", whereas the REMAINDER function returns "+1". My observation is that, for example, "2.5" is getting rounded to "2" instead of "3", which spoils the outcome of REMAINDER. I wrote a small PL/SQL block to test my observation:
SQL> set linesize 1000
SQL> set pages 100
SQL> set serveroutput on
SQL>
SQL> declare
  2    n1 number;
  3    n2 number;
  4    N number;
  5
  6    sys_rem number;
  7    my_rem number;
  8  begin
  9    for n1 in 1..5 loop
 10      --for n2 in 1..50 loop
 11      n2 := 1;
 12      while n2 <= 20 loop
 13        Sys_rem := REMAINDER(n2, n1);
 14        N := Round(n2/n1);
 15        My_rem := n2 - (n1 * N);
 16        If sys_rem <> my_rem then
 17          Dbms_output.put_line('n2 : ' || rpad(to_char(n2), 4, ' ') ||
 18                               ' n1 : ' || to_char(n1) ||
 19                               ' ' ||
 20                               ' n2/n1 = ' || rpad(to_char(n2/n1), 4, ' ') ||
 21                               ' ' ||
 22                               ' ORCL rem = ' || rpad(to_char(sys_rem), 4, ' ') ||
 23                               ' ' ||
 24                               ' Calc Rem [' || rpad(to_char(n2), 4, ' ') ||
 25                               ' - (' || to_char(n1) || '*' ||
 26                               rpad(to_char(N), 4, ' ') || ')] = ' ||
 27                               to_char(my_rem));
 28        End If;
 29        n2 := n2 + 0.1;
 30      end loop;
 31    end loop;
 32  end;
 33  /
n2 : 2.5   n1 : 1   n2/n1 = 2.5    ORCL rem = .5    Calc Rem [2.5  - (1*3   )] = -.5
n2 : 4.5   n1 : 1   n2/n1 = 4.5    ORCL rem = .5    Calc Rem [4.5  - (1*5   )] = -.5
n2 : 6.5   n1 : 1   n2/n1 = 6.5    ORCL rem = .5    Calc Rem [6.5  - (1*7   )] = -.5
n2 : 8.5   n1 : 1   n2/n1 = 8.5    ORCL rem = .5    Calc Rem [8.5  - (1*9   )] = -.5
n2 : 10.5  n1 : 1   n2/n1 = 10.5   ORCL rem = .5    Calc Rem [10.5 - (1*11  )] = -.5
n2 : 12.5  n1 : 1   n2/n1 = 12.5   ORCL rem = .5    Calc Rem [12.5 - (1*13  )] = -.5
n2 : 14.5  n1 : 1   n2/n1 = 14.5   ORCL rem = .5    Calc Rem [14.5 - (1*15  )] = -.5
n2 : 16.5  n1 : 1   n2/n1 = 16.5   ORCL rem = .5    Calc Rem [16.5 - (1*17  )] = -.5
n2 : 18.5  n1 : 1   n2/n1 = 18.5   ORCL rem = .5    Calc Rem [18.5 - (1*19  )] = -.5
n2 : 1     n1 : 2   n2/n1 = .5     ORCL rem = 1     Calc Rem [1    - (2*1   )] = -1
n2 : 5     n1 : 2   n2/n1 = 2.5    ORCL rem = 1     Calc Rem [5    - (2*3   )] = -1
n2 : 9     n1 : 2   n2/n1 = 4.5    ORCL rem = 1     Calc Rem [9    - (2*5   )] = -1
n2 : 13    n1 : 2   n2/n1 = 6.5    ORCL rem = 1     Calc Rem [13   - (2*7   )] = -1
n2 : 17    n1 : 2   n2/n1 = 8.5    ORCL rem = 1     Calc Rem [17   - (2*9   )] = -1
n2 : 1.5   n1 : 3   n2/n1 = .5     ORCL rem = 1.5   Calc Rem [1.5  - (3*1   )] = -1.5
n2 : 7.5   n1 : 3   n2/n1 = 2.5    ORCL rem = 1.5   Calc Rem [7.5  - (3*3   )] = -1.5
n2 : 13.5  n1 : 3   n2/n1 = 4.5    ORCL rem = 1.5   Calc Rem [13.5 - (3*5   )] = -1.5
n2 : 19.5  n1 : 3   n2/n1 = 6.5    ORCL rem = 1.5   Calc Rem [19.5 - (3*7   )] = -1.5
n2 : 2     n1 : 4   n2/n1 = .5     ORCL rem = 2     Calc Rem [2    - (4*1   )] = -2
n2 : 10    n1 : 4   n2/n1 = 2.5    ORCL rem = 2     Calc Rem [10   - (4*3   )] = -2
n2 : 18    n1 : 4   n2/n1 = 4.5    ORCL rem = 2     Calc Rem [18   - (4*5   )] = -2
n2 : 2.5   n1 : 5   n2/n1 = .5     ORCL rem = 2.5   Calc Rem [2.5  - (5*1   )] = -2.5
n2 : 12.5  n1 : 5   n2/n1 = 2.5    ORCL rem = 2.5   Calc Rem [12.5 - (5*3   )] = -2.5

PL/SQL procedure successfully completed.

SQL>

I have tested this code on Oracle 10g Release 2. Unfortunately, this bug also exists in Oracle 11g. I have raised a Service Request and also filed a bug with Oracle Support. Regards
Saturday, June 28, 2008
OTN - Discussion Forums in a great new look
Dear Readers,
This morning (GMT+3), when I tried to log in to the OTN Discussion Forums, they were down. But when I checked again later, they were back with a great new look.
With the new look comes new features. Following are the new features I have noticed:
* Great new look
* Add smileys to messages
* Spell Check
* Formatting
* Reward points
* Save messages
* Tag messages and
* Abuse alerts
It should be easier for OTN newbies to post their questions in a well-formatted and readable form.
Enjoy discussions !!!
Tuesday, June 24, 2008
Why my index is not used?
Well, this question keeps popping up now and then. Yesterday, one of my colleagues also came up with it: "Why is Oracle not using the index even though I am selecting less than 10% of the data?"
We ran the query with autotrace enabled and the execution plan showed a full table scan. The table contains over 29 million records, and adding the predicate reduces the result set to 2.3 million records, which is 8% of the total.
SQL> set autotrace traceonly exp
SQL> SELECT *
  2  FROM quint_sec_tbl
  3  WHERE quint_type = 'XX06FR';

Execution Plan
----------------------------------------------------------
Plan hash value: 3917650069

----------------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |  3440 |   618K|   454K  (1)| 01:30:55 |
|*  1 |  TABLE ACCESS FULL| QUINT_SEC_TBL |  3440 |   618K|   454K  (1)| 01:30:55 |
----------------------------------------------------------------------------------------

Cary Millsap's very old (but still valid) article "When to Use an Index" greatly helped me in this regard. The article reveals that index consideration is based on block selectivity, not on row selectivity. He also defines row selectivity and block selectivity in the article.

Row Selectivity: You can define the row selectivity of a given where-clause predicate as the number of rows returned by the predicate (r) divided by the total number of rows in the table (R):
P(r) = r / R
Block Selectivity:
You can define the block selectivity of a given where-clause predicate analogously, as the number of data blocks containing at least one row matching the predicate condition (b) divided by the total number of data blocks below the high-water mark (B):
P(b) = b / B
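As a rough illustration (a hedged sketch, not the hds script the article provides), both inputs can be approximated for a single predicate using the DBMS_ROWID package:

-- b: blocks containing at least one matching row (file.block keeps blocks
-- from different datafiles distinct); r: matching rows.
select count(distinct dbms_rowid.rowid_relative_fno(rowid) || '.' ||
                      dbms_rowid.rowid_block_number(rowid)) b,
       count(*) r
from   quint_sec_tbl
where  quint_type = 'XX06FR';

Dividing b by the blocks below the high-water mark and r by the total row count then gives P(b) and P(r).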
We can calculate block selectivity and row selectivity using the SQL provided in the article. I ran this SQL for my query, and the following are the results:
SQL> @hds
TableOwner : MYPRODUSER
TableName  : QUINT_SEC_TBL
ColumnList : QUINT_TYPE
WhereClause:
PageSize   : 100

Table blocks below hwm   Table rows
(B)                      (R)
---------------------- ----------------
             1,672,704       29,270,757

More:

       Block selectivity   Block count    Row selectivity   Row count
QUINT_ (pb = b/B)          (b)            (pr = r/R)        (r)
------ ----------------- -------------- ----------------- ----------------
TT34DV            45.03%        753,277            37.99%       11,120,869
FG76SC            44.47%        743,788            13.67%        4,000,205
LH23Q2            42.78%        715,558             9.44%        2,762,284
XX06FR            42.32%        707,894             8.02%        2,346,846
:
:

The output of this SQL is sorted in descending order of block selectivity. Looking at it, the row selectivity is only 8%, but to fetch that 8% of the rows Oracle has to visit 42% of the blocks (block selectivity). That means nearly half of the table's blocks contain at least one row with QUINT_TYPE = 'XX06FR'. Instead of going through the hard path of index access, it is more efficient for the optimizer to do a full table scan. So now we know why the index was ignored and a full table scan was preferred.

P.S.: For security reasons, the username, table name, column name, and column values have been modified.
Wednesday, June 18, 2008
SQL for Loan Payment
Dear all,
Inspired by "SQL for Buying a New Car", I thought of converting a very old excel sheet of mine into an SQL query. This excel sheet lists equated monthly installment (EMI), principal amount and the interest amount for duration of loan payment. I feed Rate of Interest, Loan Amount and Period of Loan (in years) to the excel sheet. This was very convenient for me to analyze what part of my EMI is going towards interest payment and what portion goes as principal repayment. It is extremely important to understand what goes for interest and what goes for principal repayment when you are planning for a loan.
Output from Excel sheet:
This can easily be converted into an SQL query using MODEL clause introduced in Oracle 10g.
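For reference, the EMI figure the query computes is the standard annuity formula:

EMI = P * i / (1 - (1 + i)^(-n))

where P is the loan amount, i is the periodic interest rate (annual rate / (100 * payments per year)), and n is the total number of payments. For the inputs used below (P = 100,000 at 10% annual interest, paid monthly over 2 years), i = 10/1200 and n = 24, giving an EMI of 4,614.49, which matches the query output.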
The output of this query includes a summary of the total interest to be paid, the principal to be paid, and the net payment to be made. Of course, we need to ignore the summary of the "Remaining" column, as it is meaningless.
SQL> set line 10000
SQL> set pages 500
SQL> set verify off
SQL>
SQL> column interest format 999,999,999.99
SQL> column principal format 999,999,999.99
SQL> column remaining format 999,999,999.99
SQL>
SQL> define roi = 10       -- Rate of Interest
SQL> define years = 2      -- Payment Period
SQL> define amt = 100000   -- Loan Amount
SQL> define period = 12    -- Mode of payment.
SQL> -- Monthly=12, Bi-monthly=48, Quarterly=4, Half-yearly=2, Yearly=1, Adhoc=n
SQL>
SQL> Select inst_no,
  2         sum(new_emi) emi,
  3         sum(new_interest) interest,
  4         sum(principal) principal,
  5         sum(remaining) remaining
  6  From ( Select inst_no + 1 inst_no,
  7                new_emi,
  8                interest,
  9                decode( inst_no + 1, &period*&years, interest - remaining, interest) new_interest,
 10                decode( inst_no + 1, &period*&years, principal + remaining, principal) principal,
 11                decode( inst_no + 1, &period*&years, 0, remaining) remaining
 12         From (select rownum inst_no, roi, yrs, amt, period,
 13                      round( (roi * amt/(period * 100)) /
 14                             (1 - power((1 + roi/(period * 100)), -yrs * period)), 2) emi
 15               From (select &roi roi, &years yrs, &amt amt, &period period
 16                     From Dual) )
 17         MODEL
 18           DIMENSION BY ( inst_no )
 19           MEASURES ( roi, yrs, amt, emi, period, 0 as new_emi, 0 as interest, 0 as principal, 0 as remaining)
 20           RULES
 21           UPSERT
 22           ITERATE (1000)
 23           (
 24             new_emi[ITERATION_NUMBER] = emi[1],
 25             interest[ITERATION_NUMBER] = round(&roi/(period[1] * 100) *
 26                                          nvl(remaining[ITERATION_NUMBER-1], amt[1]), 2),
 27             principal[ITERATION_NUMBER] = emi[1] - interest[cv()],
 28             remaining[ITERATION_NUMBER] = nvl(remaining[ITERATION_NUMBER-1], amt[1]) - principal[cv()]))
 29  Where interest > 0
 30  Group by rollup( inst_no)
 31  order by inst_no;

   INST_NO        EMI        INTEREST       PRINCIPAL       REMAINING
---------- ---------- --------------- --------------- ---------------
         1    4614.49          833.33        3,781.16       96,218.84
         2    4614.49          801.82        3,812.67       92,406.17
         3    4614.49          770.05        3,844.44       88,561.73
         4    4614.49          738.01        3,876.48       84,685.25
         5    4614.49          705.71        3,908.78       80,776.47
         6    4614.49          673.14        3,941.35       76,835.12
         7    4614.49          640.29        3,974.20       72,860.92
         8    4614.49          607.17        4,007.32       68,853.60
         9    4614.49          573.78        4,040.71       64,812.89
        10    4614.49          540.11        4,074.38       60,738.51
        11    4614.49          506.15        4,108.34       56,630.17
        12    4614.49          471.92        4,142.57       52,487.60
        13    4614.49          437.40        4,177.09       48,310.51
        14    4614.49          402.59        4,211.90       44,098.61
        15    4614.49          367.49        4,247.00       39,851.61
        16    4614.49          332.10        4,282.39       35,569.22
        17    4614.49          296.41        4,318.08       31,251.14
        18    4614.49          260.43        4,354.06       26,897.08
        19    4614.49          224.14        4,390.35       22,506.73
        20    4614.49          187.56        4,426.93       18,079.80
        21    4614.49          150.67        4,463.82       13,615.98
        22    4614.49          113.47        4,501.02        9,114.96
        23    4614.49           75.96        4,538.53        4,576.43
        24    4614.49           38.06        4,576.43             .00
             110747.76       10,747.76      100,000.00    1,189,739.34

25 rows selected.

SQL>

Results of this query can be verified here.

References:
1. Loan Calculator.
2. How to calculate EMI.

Happy reading.
Friday, June 06, 2008
Create a Unique Constraint Ignoring Existing Duplicate Data
Hi,
There was an interesting post on OTN forums, where the OP asks:
Can we create a unique index on a column having duplicate values which can enforce the uniqueness for future values?
Well, this is possible as Oracle is rich in features. It can be achieved by creating a unique constraint with DEFERRABLE and NOVALIDATE keywords. By default, constraints are created as NON DEFERRABLE and VALIDATE. When we create a constraint with DEFERRABLE and NOVALIDATE, we instruct Oracle to defer checking existing data for uniqueness but always check the new or modified rows.
Here's how we can do this:
SQL> drop table t purge;

Table dropped.

SQL> create table t
  2  ( x int );

Table created.

SQL> insert into t values ( 1 );

1 row created.

SQL> insert into t values ( 1 );

1 row created.

SQL> alter table t add constraint t_uk unique (x)
  2  deferrable initially immediate novalidate;

Table altered.

SQL> select * from t;

         X
----------
         1
         1

SQL> alter table t modify constraint t_uk enable novalidate;

Table altered.

SQL> insert into t values ( 1 );
insert into t values ( 1)
*
ERROR at line 1:
ORA-00001: unique constraint (TEST.T_UK) violated

SQL> select * from t;

         X
----------
         1
         1

SQL> insert into t values (2);

1 row created.

SQL>

Happy reading !!!
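One detail worth noting (an aside, easily verified): a DEFERRABLE constraint is policed by a non-unique index, which is exactly what allows the pre-existing duplicates to remain in place:

SQL> select index_name, uniqueness
  2  from user_indexes
  3  where table_name = 'T';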
Monday, May 26, 2008
Database is very slow!!!
Dear Readers,
I received a call from one of our developers and following was the conversation that took place between us:
He: “Database is very slow”
Me: What is very slow? Can you tell me what actually you are doing?
He: I am running a simple report which is supposed to return less than 100 records. But it’s been more than 4 hours and the query is still running.
Me: How long did it use to take before?
He: This is the first time we are running this query.
Me: Ok, let me log in to the database.
I logged into the database and ran my set of commands to trace the culprit SQL. I was able to identify the query and here it is:
SELECT TO_CHAR(TRN_DATE, 'YYYY-MM-DD HH24:MI:SS') TRN_DATE,
       TRAN_TYPE, TRAN_REF, KEY_NO, IN_AMT A1, 0 A2
  FROM TRANSACTION_DAILY
 WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
   AND SRC_FLAG = 'L'
   AND (SUBSTR(OUT_KEY_NO,1,10) = 'DXIY0-19XV' or SUBSTR(IN_KEY_NO,1,10) = 'DXIY0-19XV')
   AND TRAN_TYPE || NVL(KEY_NO,'A') || NVL(IN_AMT,0) NOT IN
       ( SELECT TD_TRAN_TYPE || TD_KEY_NO_REF || NVL(OUT_AMT,0)
           FROM TRANSACTION_DETAILS
          WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
            AND RM_CODE = 'XXX123' )
UNION
SELECT TO_CHAR(TRN_DATE, 'YYYY-MM-DD HH24:MI:SS') TRN_DATE,
       TD_TRAN_TYPE, TRAN_REF, TD_KEY_NO_REF, 0 A1, OUT_AMT A2
  FROM TRANSACTION_DETAILS
 WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
   AND RM_CODE = 'XXX123'
   AND TD_TRAN_TYPE || TD_KEY_NO_REF || NVL(OUT_AMT,0) NOT IN
       ( SELECT TRAN_TYPE || NVL(KEY_NO,'A') || NVL(IN_AMT,0)
           FROM TRANSACTION_DAILY
          WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
            AND SRC_FLAG = 'L'
            AND (SUBSTR(OUT_KEY_NO,1,10) = 'DXIY0-19XV' or SUBSTR(IN_KEY_NO,1,10) = 'DXIY0-19XV') )

and this is the execution plan:
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                       |       |       |   51M (100)|          |       |       |
|   1 |  SORT UNIQUE                           |                       |  2842 |   159K|   51M  (70)|198:47:09 |       |       |
|   2 |   UNION-ALL                            |                       |       |       |            |          |       |       |
|*  3 |    FILTER                              |                       |       |       |            |          |       |       |
|   4 |     PARTITION RANGE SINGLE             |                       |  2042 |   135K|  2854   (2)| 00:00:40 |   KEY |   KEY |
|*  5 |      TABLE ACCESS FULL                 | TRANSACTION_DAILY     |  2042 |   135K|  2854   (2)| 00:00:40 |   KEY |   KEY |
|   6 |     PARTITION RANGE SINGLE             |                       |   760 | 19000 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|*  7 |      TABLE ACCESS BY LOCAL INDEX ROWID | TRANSACTION_DETAILS   |   760 | 19000 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|*  8 |       INDEX RANGE SCAN                 | TRAN_DET_IDX          |   434K|       |   980   (1)| 00:00:14 |   KEY |   KEY |
|*  9 |    FILTER                              |                       |       |       |            |          |       |       |
|  10 |     PARTITION RANGE SINGLE             |                       |   800 | 24800 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|* 11 |      TABLE ACCESS BY LOCAL INDEX ROWID | TRANSACTION_DETAILS   |   800 | 24800 | 15269   (1)| 00:03:34 |   KEY |   KEY |
|* 12 |       INDEX RANGE SCAN                 | TRAN_DET_IDX          |   434K|       |   980   (1)| 00:00:14 |   KEY |   KEY |
|* 13 |     TABLE ACCESS BY GLOBAL INDEX ROWID | TRANSACTION_DAILY     |  1940 |   111K| 88735   (1)| 00:20:43 | ROW L | ROW L |
|* 14 |      INDEX RANGE SCAN                  | PK_TRANSACTION_DAILY  |   102K|       |   413   (1)| 00:00:06 |       |       |
-------------------------------------------------------------------------------------------------------------------------------

From the above execution plan, the optimizer thinks it needs "198:47:09" hours to "SORT UNIQUE" the result set. The actual time the query took to complete was 6 hours and 49 minutes (24540 seconds). I then simplified the query in order to understand it clearly, by removing some of the predicates. The simplified query is:
SELECT *
  FROM TRANSACTION_DAILY
 WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
   AND TRAN_TYPE || NVL(KEY_NO,'A') || NVL(IN_AMT,0) NOT IN
       ( SELECT TD_TRAN_TYPE || TD_KEY_NO_REF || NVL(OUT_AMT,0)
           FROM TRANSACTION_DETAILS
          WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) )
UNION
SELECT *
  FROM TRANSACTION_DETAILS
 WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' )
   AND TD_TRAN_TYPE || TD_KEY_NO_REF || NVL(OUT_AMT,0) NOT IN
       ( SELECT TRAN_TYPE || NVL(KEY_NO,'A') || NVL(IN_AMT,0)
           FROM TRANSACTION_DAILY
          WHERE TRN_DATE = TO_DATE ( '#BUSDATE#', 'DDMMYYYY' ) )

Hmmmm, this query seems to be doing a full outer join on the "TRANSACTION_DAILY" and "TRANSACTION_DETAILS" tables. I called him back:

Me: What is the objective of this report?
He: We need to find all the information from the "TRANSACTION_DAILY" table which does not exist in the "TRANSACTION_DETAILS" table, and vice versa.

Yes, I was right. It's doing a full outer join. I modified his original query and proposed a new query which executes at lightning speed and, more importantly, yields the same result:
SELECT nvl(a.TRN_DATE, b.TRN_DATE) TRN_DATE,
       a.TRN_TYPE, b.OTH_TRN_TYPE,
       a.KEY_NO, b.KEY_NO_REF,
       nvl(a.IN_AMT, 0) A1,
       nvl(b.OUT_AMT, 0) A2
  FROM (select TRN_DATE, TRN_TYPE, KEY_NO, IN_AMT
          from TRANSACTION_DAILY
         where TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')
           and (substr(IN_KEY_NO, 1, 10) = 'DXIY0-19XV' OR
                substr(OUT_KEY_NO, 1, 10) = 'DXIY0-19XV')) a
  FULL OUTER JOIN
       (select TRN_DATE, OTH_TRN_TYPE, KEY_NO_REF, OUT_AMT
          from TRANSACTION_DETAILS
         where RM_CODE = 'XXX123'
           and TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')) b
    ON (a.TRN_DATE = b.TRN_DATE and
        a.TRN_TYPE = b.OTH_TRN_TYPE and
        a.KEY_NO = b.KEY_NO_REF and
        a.IN_AMT = b.OUT_AMT)
 WHERE (a.TRN_TYPE is null or b.OTH_TRN_TYPE is null or
        a.KEY_NO is null or b.KEY_NO_REF is null or
        a.IN_AMT is null or b.OUT_AMT is null);

I called him back with over-excitement, announcing that the query was ready and we should test it:

Me: I have re-written the query; kindly come over to my office so that we can test it together.
He: How long does it take to execute?
Me: It completes in less than 4 seconds.
He: What...? Less than 4 seconds? Are you sure? Have you included all the conditions that I mentioned?
Me: Yes, let us test it.
He ran a couple of tests and the query was correct for every test case except one. The requirements were getting a little messy, so I decided to create two tables and test the query against them. I created two tables, A and B, with the same structure and some sample data:
create table a (id number, amt number);

insert into a values (1, 10);
insert into a values (2, 20);
insert into a values (3, 30);
insert into a values (4, 40);

create table b (id number, amt number);

insert into b values (2, 20);
insert into b values (3, 30);
insert into b values (5, 50);
insert into b values (6, 60);

commit;

Basically, we need to find the rows that exist in A but not in B, and also the rows that exist in B but not in A. Here's the query:
SQL> select *
  2  from a FULL OUTER JOIN b
  3  on a.id = b.id
  4  where a.id is null or b.id is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         4         40
                               6         60
                               5         50

SQL>

He said: "Yes, the query is fetching the right data, but what happens when you have two records with the same id and amt in table A and only a single record in B? In this case, the query should display one record from table A." Adding to the complexity, he said: "There is no third column in either table to distinguish this occurrence." So, we inserted a record into each table and ran the same query:
insert into a values (1, 10);
insert into b values (1, 10);

SQL> select *
  2  from a FULL OUTER JOIN b
  3  on a.id = b.id
  4  where a.id is null or b.id is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         4         40
                               6         60
                               5         50

SQL>

Oops!! The query fails at this point: table A now holds two (1, 10) rows and B holds only one, but the join on "id" happily matches both copies in A to the single row in B, so neither side is null and the extra occurrence never shows up. I then decided to re-write this query, this time using analytic functions to rescue me from the problem.
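The general idea is to number the duplicates on each side, so that the first copy of a row in A can only match the first copy in B, the second copy only the second, and so on; an unmatched copy then falls out of the full outer join with NULLs on the other side. Here is a minimal sketch of just the numbering step against table A (expected output shown as comments, given the data above):

-- Number each copy of a duplicated (id, amt) pair in A.
-- With the data above, the second (1, 10) row gets rno = 2.
select id, amt,
       row_number() over (partition by id, amt order by id) rno
  from a
 order by id, rno;

--         ID        AMT        RNO
-- ---------- ---------- ----------
--          1         10          1
--          1         10          2
--          2         20          1
--          3         30          1
--          4         40          1

Applying the same numbering to both sides and adding it to the join condition gives the query below.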
SQL> select x.id, x.amt, y.id, y.amt
  2  from (select id, amt,
  3               row_number() over (partition by id, amt
  4                                  order by id, amt) rno
  5        from a) x
  6  FULL OUTER JOIN
  7       (select id, amt,
  8               row_number() over (partition by id, amt
  9                                  order by id, amt) rno
 10        from b) y
 11  on x.id = y.id and x.rno = y.rno
 12  where x.id is null
 13  or y.id is null
 14  or x.rno is null
 15  or y.rno is null;

        ID        AMT         ID        AMT
---------- ---------- ---------- ----------
         1         10
         4         40
                               5         50
                               6         60

SQL>

Yippy!!! This query rocks. I then quickly transformed the original query into this form and ran it again:
SELECT nvl(a.TRN_DATE, b.TRN_DATE) TRN_DATE,
       a.TRN_TYPE, b.OTH_TRN_TYPE,
       a.KEY_NO, b.KEY_NO_REF,
       nvl(a.IN_AMT, 0) A1, nvl(b.OUT_AMT, 0) A2
  FROM (select TRN_DATE, TRN_TYPE, KEY_NO, IN_AMT,
               row_number() over (partition by KEY_NO, TRN_TYPE, IN_AMT
                                  order by KEY_NO, TRN_TYPE, IN_AMT) rno
          from TRANSACTION_DAILY
         where TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')
           and (substr(IN_KEY_NO, 1, 10) = 'DXIY0-19XV' OR
                substr(OUT_KEY_NO, 1, 10) = 'DXIY0-19XV')) a
  FULL OUTER JOIN
       (select TRN_DATE, OTH_TRN_TYPE, KEY_NO_REF, OUT_AMT,
               row_number() over (partition by KEY_NO_REF, OTH_TRN_TYPE, OUT_AMT
                                  order by KEY_NO_REF, OTH_TRN_TYPE, OUT_AMT) rno
          from TRANSACTION_DETAILS
         where RM_CODE = 'XXX123'
           and TRN_DATE = to_date('16-05-2008', 'DD-MM-YYYY')) b
    ON (a.TRN_DATE = b.TRN_DATE
        and a.TRN_TYPE = b.OTH_TRN_TYPE
        and a.KEY_NO = b.KEY_NO_REF
        and a.IN_AMT = b.OUT_AMT
        and a.rno = b.rno)
 WHERE (a.TRN_TYPE is null or b.OTH_TRN_TYPE is null
        or a.KEY_NO is null or b.KEY_NO_REF is null
        or a.IN_AMT is null or b.OUT_AMT is null
        or a.rno is null or b.rno is null);

Execution plan of this query is:
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                      |  2230 |  409K | 48722  (1) | 00:11:23 |       |       |
|   1 |  VIEW                                     |                      |  2230 |  409K | 48722  (1) | 00:11:23 |       |       |
|   2 |   UNION-ALL                               |                      |       |       |            |          |       |       |
|*  3 |    FILTER                                 |                      |       |       |            |          |       |       |
|*  4 |     HASH JOIN RIGHT OUTER                 |                      |  1552 |  175K | 24361  (1) | 00:05:42 |       |       |
|   5 |      VIEW                                 |                      |   678 | 40002 | 14852  (1) | 00:03:28 |       |       |
|   6 |       WINDOW SORT                         |                      |   678 | 12882 | 14852  (1) | 00:03:28 |       |       |
|   7 |        PARTITION RANGE SINGLE             |                      |   678 | 12882 | 14851  (1) | 00:03:28 |    14 |    14 |
|*  8 |         TABLE ACCESS BY LOCAL INDEX ROWID | TRANSACTION_DETAILS  |   678 | 12882 | 14851  (1) | 00:03:28 |    14 |    14 |
|*  9 |          INDEX RANGE SCAN                 | TRAN_DET_IDX         |  422K |       |   897  (1) | 00:00:13 |    14 |    14 |
|  10 |      VIEW                                 |                      |  1552 | 88464 |  9508  (1) | 00:02:14 |       |       |
|  11 |       WINDOW SORT                         |                      |  1552 | 69840 |  9508  (1) | 00:02:14 |       |       |
|  12 |        PARTITION RANGE SINGLE             |                      |  1552 | 69840 |  9507  (1) | 00:02:14 |     7 |     7 |
|* 13 |         TABLE ACCESS FULL                 | TRANSACTION_DAILY    |  1552 | 69840 |  9507  (1) | 00:02:14 |     7 |     7 |
|* 14 |    HASH JOIN ANTI                         |                      |   678 | 78648 | 24361  (1) | 00:05:42 |       |       |
|  15 |     VIEW                                  |                      |   678 | 40002 | 14852  (1) | 00:03:28 |       |       |
|  16 |      WINDOW SORT                          |                      |   678 | 12882 | 14852  (1) | 00:03:28 |       |       |
|  17 |       PARTITION RANGE SINGLE              |                      |   678 | 12882 | 14851  (1) | 00:03:28 |    14 |    14 |
|* 18 |        TABLE ACCESS BY LOCAL INDEX ROWID  | TRANSACTION_DETAILS  |   678 | 12882 | 14851  (1) | 00:03:28 |    14 |    14 |
|* 19 |         INDEX RANGE SCAN                  | TRAN_DET_IDX         |  422K |       |   897  (1) | 00:00:13 |    14 |    14 |
|  20 |     VIEW                                  |                      |  1552 | 88464 |  9508  (1) | 00:02:14 |       |       |
|  21 |      WINDOW SORT                          |                      |  1552 | 69840 |  9508  (1) | 00:02:14 |       |       |
|  22 |       PARTITION RANGE SINGLE              |                      |  1552 | 69840 |  9507  (1) | 00:02:14 |     7 |     7 |
|* 23 |        TABLE ACCESS FULL                  | TRANSACTION_DAILY    |  1552 | 69840 |  9507  (1) | 00:02:14 |     7 |     7 |
----------------------------------------------------------------------------------------------------------------------------------

The results were convincing: even with the extra analytic step, the query still takes about 4 seconds to fetch the data, compared to 6 hours and 49 minutes for the original. That makes it roughly 6135 times faster (24,540 seconds / 4 seconds). This query left the end users rejoicing, and he left my desk smiling, but without answering my question: "Is the database very slow?"
P.S.: I have renamed all the table names and column names so as not to reveal official and sensitive information on my personal blog, while still preserving the reality of the case.
Regards
Friday, May 23, 2008
Automatic statistics gathering during Index Creation and Rebuilds
Dear all,
While testing dynamic sampling in Oracle 10g, I came across a new 10g feature (very late, though).
My test case for dynamic sampling goes like this:
(a) Create table with data
(b) Create an index, and
(c) Execute the query and note the execution plan
SQL> Create table t as select * from scott.emp;

Table created.

SQL> Create index t_idx on t(empno);

Index created.

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 1020776977

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    87 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    87 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Yes, the optimizer did dynamic sampling and picked the index plan as the optimal execution plan. Now, let me delete the statistics and re-run the same SQL.
SQL> Exec dbms_stats.delete_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    87 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    87 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Oops, the optimizer did consider dynamic sampling, but this time it chose a full table scan. How come the same optimizer picked the index access plan on the first run? Ok, let me do it again, this time being very careful.
(a) Create table with data
SQL> Drop table t purge;

Table dropped.

SQL> Create table t as select * from scott.emp;

Table created.

SQL> Select num_rows, last_analyzed from user_tables where table_name = 'T';

  NUM_ROWS LAST_ANALYZED
---------- --------------


SQL>

(b) Create an index, and
SQL> Create index t_idx on t(empno);

Index created.

SQL> column index_name format a10
SQL> set line 10000
SQL> select blevel, leaf_blocks, distinct_keys, clustering_factor,
  2         num_rows, last_analyzed, user_stats, global_stats
  3  from user_indexes
  4  where index_name = 'T_IDX';

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED        USE GLO
---------- ----------- ------------- ----------------- ---------- -------------------- --- ---
         0           1            14                 1         14 05-May-2008 10:14:48 NO  NO

SQL>

Oh! I did not gather statistics, yet the index already has them.
(c) Execute the query and note the execution plan
SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 1020776977

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    87 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    87 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

Now it's clear why the optimizer finds the index access path to be the best plan. Let's continue: delete the statistics and then execute the query again:
SQL> Exec dbms_stats.delete_table_stats(user, 't');

PL/SQL procedure successfully completed.

SQL> select blevel, leaf_blocks, distinct_keys, clustering_factor,
  2         num_rows, last_analyzed, user_stats, global_stats
  3  from user_indexes
  4  where index_name = 'T_IDX';

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED        USE GLO
---------- ----------- ------------- ----------------- ---------- -------------------- --- ---
                                                                                       NO  NO

SQL> set autotrace traceonly explain
SQL> Select * from t where empno = 7788;

Execution Plan
----------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    87 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    87 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMPNO"=7788)

Note
-----
   - dynamic sampling used for this statement

SQL> Set autotrace off

This time the optimizer did dynamic sampling, and the full table scan came out less expensive. The explanation: in Oracle 10g, when an index is created, Oracle automatically gathers statistics on the newly created index. The same is true of an index rebuild. That is why, on the first run, the optimizer had index statistics available and picked the index access path as the best plan.
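You can confirm the rebuild case with a quick sketch like the one below: after deleting the index statistics, a rebuild should repopulate NUM_ROWS and LAST_ANALYZED in USER_INDEXES without any call to DBMS_STATS. (In 10g the old COMPUTE STATISTICS clause of CREATE/ALTER INDEX still parses, but it is deprecated since statistics are now collected by default; see the SQL Reference cited below.)

-- Sketch: verify that an index rebuild refreshes index statistics on its own.
SQL> Exec dbms_stats.delete_index_stats(user, 't_idx');

SQL> alter index t_idx rebuild;

-- LAST_ANALYZED should now show the rebuild time, with no DBMS_STATS call:
SQL> select num_rows, last_analyzed
  2  from user_indexes
  3  where index_name = 'T_IDX';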
References:
Oracle® Database SQL Reference 10g Release 2 (10.2)

Regards
Monday, May 12, 2008
ORA-27061: waiting for async I/Os failed
Hi
One of our physical standby databases aborted with the "ORA-27061: waiting for async I/Os failed" error; the instance was brought down by the DBWR process. It turned out to be an OS bug (IBM AIX 5L based systems), and an OS patch resolves the issue.
Our current OS patch level is AIX 5.3 TL5-CSP, and we need to upgrade to either AIX 5.3 TL6 SP4 or AIX 5.3 TL7 to prevent this problem from recurring.
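Until the OS patch is applied, one interim idea (my assumption only, not something taken from the note, so validate it against Metalink before touching production) is to fall back to synchronous I/O for datafiles, trading some DBWR throughput for stability:

-- Assumption: disabling async I/O sidesteps the failing AIX async I/O path.
-- DISK_ASYNCH_IO is a static parameter, so an instance restart is required.
ALTER SYSTEM SET disk_asynch_io = FALSE SCOPE = SPFILE;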
References:
Metalink Note Id: 467613.1 addresses this issue. Happy reading :-)
Monday, May 05, 2008
Be careful when using DBMS_UTILITY to analyze
Hello,
If you are still using DBMS_UTILITY.ANALYZE_DATABASE or DBMS_UTILITY.ANALYZE_SCHEMA to analyze your databases/schemas, then you need to be very cautious.
I have observed a very strange behavior of this procedure against partitioned tables in one of our databases: statistics are not being updated at the table level, although partition statistics are up to date.
The reason for incorrect statistics is:
If you use the DBMS_STATS package to gather statistics on a partitioned table and later run DBMS_UTILITY.ANALYZE_SCHEMA, the table-level statistics are NOT updated; only the statistics on the partitions and indexes are modified.
This peculiar behavior is observed only with partitioned tables.
The ANALYZE_SCHEMA procedure is obsolete, and anyone still using it to gather statistics should seriously consider moving to the DBMS_STATS package.
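As a replacement, something along these lines covers the whole schema, including table-level statistics on partitioned tables (a sketch only; the parameter values are illustrative and should be tuned for your environment):

BEGIN
  dbms_stats.gather_schema_stats(
    ownname          => 'TEST',                       -- schema from the demo below
    estimate_percent => dbms_stats.auto_sample_size,  -- let Oracle pick the sample size
    granularity      => 'ALL',                        -- global, partition and subpartition stats
    cascade          => TRUE);                        -- gather index statistics too
END;
/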
Following is a simple demo of the problem:
SQL> CREATE TABLE part_tab
  2  (id NUMBER(5),
  3   dt DATE)
  4  PARTITION BY RANGE(dt)
  5  (
  6  PARTITION part1_jan2008 VALUES LESS THAN(TO_DATE('01/02/2008','DD/MM/YYYY')),
  7  PARTITION part2_feb2008 VALUES LESS THAN(TO_DATE('01/03/2008','DD/MM/YYYY')),
  8  PARTITION part3_mar2008 VALUES LESS THAN(TO_DATE('01/04/2008','DD/MM/YYYY')),
  9  PARTITION part4_apr2008 VALUES LESS THAN(TO_DATE('01/05/2008','DD/MM/YYYY'))
 10  );

Table created.

SQL> create table non_part_tab (id number, dt date);

Table created.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB
PART_TAB

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008
PART1_JAN2008
PART3_MAR2008
PART4_APR2008

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:05:43
PART_TAB                                0 05-05-2008 00:05:43

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:05:43
PART1_JAN2008                           0 05-05-2008 00:05:43
PART3_MAR2008                           0 05-05-2008 00:05:43
PART4_APR2008                           0 05-05-2008 00:05:43

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:05:59
PART_TAB                                0 05-05-2008 00:05:59

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:05:59
PART1_JAN2008                           0 05-05-2008 00:05:59
PART3_MAR2008                           0 05-05-2008 00:05:59
PART4_APR2008                           0 05-05-2008 00:05:59

SQL> exec dbms_stats.gather_schema_stats('TEST');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:06:11
PART_TAB                                0 05-05-2008 00:06:11

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:06:11
PART1_JAN2008                           0 05-05-2008 00:06:11
PART3_MAR2008                           0 05-05-2008 00:06:11
PART4_APR2008                           0 05-05-2008 00:06:11

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:06:23
PART_TAB                                0 05-05-2008 00:06:11   ---> Statistics are NOT updated.
SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:06:23
PART1_JAN2008                           0 05-05-2008 00:06:23
PART3_MAR2008                           0 05-05-2008 00:06:23
PART4_APR2008                           0 05-05-2008 00:06:23

SQL> exec dbms_utility.analyze_schema('TEST', 'COMPUTE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:08:53
PART_TAB                                0 05-05-2008 00:06:11   ---> Statistics are NOT updated.

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:08:53
PART1_JAN2008                           0 05-05-2008 00:08:53
PART3_MAR2008                           0 05-05-2008 00:08:53
PART4_APR2008                           0 05-05-2008 00:08:53

SQL> exec dbms_stats.gather_schema_stats('TEST');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, last_analyzed from user_tables;

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
NON_PART_TAB                            0 05-05-2008 00:09:05
PART_TAB                                0 05-05-2008 00:09:05

SQL> select partition_name, num_rows, last_analyzed from user_tab_partitions;

PARTITION_NAME                   NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -------------------
PART2_FEB2008                           0 05-05-2008 00:09:05
PART1_JAN2008                           0 05-05-2008 00:09:05
PART3_MAR2008                           0 05-05-2008 00:09:05
PART4_APR2008                           0 05-05-2008 00:09:05

SQL>

Useful references:
1) How to Move from ANALYZE (using DBMS_UTILITY) to DBMS_STATS (Note: 237397.1)
2) Oracle® Database PL/SQL Packages and Types Reference 10g Release 2 (10.2)

Happy reading !!!
Thursday, May 01, 2008
Small Change and A Huge Gain
Hi,
The elapsed time of one of our data warehouse procedures averaged 17 minutes. Following is the skeleton of the procedure:
Create or replace procedure Update_Customers
Is
   Cursor Cust_Cur Is
      Select *
        From Stg_Customers;
   l_Cust_Value Customers.Cust_Value%TYPE;
Begin
   For Cust_Rec In Cust_Cur
   Loop
      :
      :
      Select Cust_Value
        Into l_Cust_Value
        From Customers
       Where Cust_Code = Cust_Rec.Cust_Code
         And Cust_Key  = Cust_Rec.Cust_Key;
      :
      :
   End Loop;
End Update_Customers;

Upon taking snapshots before and after executing the procedure, it was evident that most of the time was consumed by the "SELECT ... FROM Customers ..." statement.
Elapsed time from the AWR report:
Procedure: 1021 seconds
SQL statement: 925 seconds
There is a composite index on the "Cust_Code" and "Cust_Key" columns, and when I ran the same statement on its own in SQL*Plus it fetched results very fast using the appropriate index. STG_CUSTOMERS is a staging table of nearly 250,000 records, purged and repopulated daily. So the CUSTOMERS table was being probed 250,000 times inside the loop: each individual execution was very fast, but the sheer number of repetitions made this query the main consumer of time. I replaced the original cursor by joining the CUSTOMERS and STG_CUSTOMERS tables, as shown below:
Create or replace procedure Update_Customers
Is
   Cursor Cust_Cur Is
      Select *
        From Stg_Customers A, Customers B
       Where a.cust_code = b.cust_code
         And a.cust_key  = b.cust_key;
   l_Cust_Value Customers.Cust_Value%TYPE;
Begin
   For Cust_Rec In Cust_Cur
   Loop
      :
      :
      :
   End Loop;
End Update_Customers;

When this modified procedure was executed, performance improved remarkably and the elapsed time dropped to less than 75 seconds. Below is the elapsed time of the same procedure before and after the modification:
TYPE            DATE        Elapsed(Min)    CPU(Min)
--------------- ----------- --------------- ---------------
Procedure       19-Apr-2008 17.70           1.87
SQL Statement   19-Apr-2008 15.98           13.15
Procedure       20-Apr-2008 17.67           1.80
SQL Statement   20-Apr-2008 16.05           12.95
Procedure       21-Apr-2008 16.93           1.82
SQL Statement   21-Apr-2008 15.35           12.85
Procedure       22-Apr-2008 16.68           1.78
SQL Statement   22-Apr-2008 15.08           12.42
Procedure       23-Apr-2008 16.38           1.78
SQL Statement   23-Apr-2008 14.77           12.43
Procedure       24-Apr-2008 1.15            .92
Procedure       25-Apr-2008 1.13            .92
Procedure       26-Apr-2008 1.20            .93
Procedure       27-Apr-2008 1.23            .93
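As a side note: when a loop like this cannot simply be collapsed into a join (for example, when complex per-row logic must remain), fetching the driving cursor in batches with BULK COLLECT is another common way to cut down the round trips. A rough sketch, assuming a select list like the one above (the column names and the LIMIT value are illustrative):

Create or replace procedure Update_Customers
Is
   Cursor Cust_Cur Is
      Select a.Cust_Code, a.Cust_Key, b.Cust_Value   -- explicit list avoids duplicate column names
        From Stg_Customers A, Customers B
       Where a.cust_code = b.cust_code
         And a.cust_key  = b.cust_key;
   Type Cust_Tab_T Is Table Of Cust_Cur%ROWTYPE;
   l_Rows Cust_Tab_T;
Begin
   Open Cust_Cur;
   Loop
      Fetch Cust_Cur Bulk Collect Into l_Rows Limit 500;  -- fetch 500 rows per round trip
      Exit When l_Rows.Count = 0;
      For i In 1 .. l_Rows.Count
      Loop
         Null;  -- per-row processing goes here
      End Loop;
   End Loop;
   Close Cust_Cur;
End Update_Customers;

Regards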