Sunday, January 20, 2008

Performance Comparison of Different Datatypes

Dear readers,

Oracle database has a rich collection of datatypes and it offers different datatypes for different needs. Basically, datatype can be either scalar or non-scalar. A scalar type contains an atomic value, whereas a non-scalar contains a set of values. Examples of scalar datatypes include number, varchar2, etc, while that of non-scalar could be a collection.

Apart from the rich collection of data types available in Oracle database, PL/SQL offers few more datatypes like BINARY_INTEGER and PLS_INTEGER. These datatypes can be used within a PL/SQL block.

Often, these datatypes provide better performance over other and one should use them where appropriate.

I picked up couple of datatypes and performed 100,000,000 iterations to compare processing time of these datatypes. Amazingly there was a huge difference of processing time between datatypes. For some of the datatypes the processing time was 10 times lower than their counterparts.

Below is the script and its output, which I ran against Oracle Database 10g Release 2 (10.2.0.3).

set serveroutput on

declare
  l_time number;

  l_bi   BINARY_INTEGER;
  l_pi   PLS_INTEGER;
  l_bf   BINARY_FLOAT;
  l_bd   BINARY_DOUBLE;
  l_ntn  NATURALN := 0;

  l_num  NUMBER;
  l_int1 INTEGER;
  l_int2 INT;
  l_sint SMALLINT;
  l_dec1 DECIMAL;
  l_dec2 DEC;
  l_real REAL;
  l_flt  FLOAT;
  l_nrc  NUMERIC;
  l_dpr  DOUBLE PRECISION;

begin
  l_time := dbms_utility.get_time;
  dbms_output.put_line(chr(10)  chr(10) 
      'Time taken for 100,000,000 iterations for : '  chr(10)  chr(10) );

  for i in 0..99999999 loop
    l_bi := i;
  end loop;
  dbms_output.put_line(rpad('BINARY_INTEGER', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_pi := i;
  end loop;
  dbms_output.put_line(rpad('PLS_INTEGER', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_ntn := i;
  end loop;
  dbms_output.put_line(rpad('NATURALN', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_num := i;
  end loop;
  dbms_output.put_line(rpad('NUMBER', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_int1 := i;
  end loop;
  dbms_output.put_line(rpad('INTEGER', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_int2 := i;
  end loop;
  dbms_output.put_line(rpad('INT', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_sint := i;
  end loop;
  dbms_output.put_line(rpad('SMALLINT', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_dec1 := i;
  end loop;
  dbms_output.put_line(rpad('DECIMAL', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_dec2 := i;
  end loop;
  dbms_output.put_line(rpad('DEC', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_real := i;
  end loop;
  dbms_output.put_line(rpad('REAL', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_flt := i;
  end loop;
  dbms_output.put_line(rpad('FLOAT', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_nrc := i;
  end loop;
  dbms_output.put_line(rpad('NUMERIC', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

  l_time := dbms_utility.get_time;
  for i in 0..99999999 loop
    l_dpr := i;
  end loop;
  dbms_output.put_line(rpad('DOUBLE PRECISION', 20, ' ')
                   ' = '  lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));

end;

Time taken for 100,000,000 iterations for :


BINARY_INTEGER......  172
PLS_INTEGER.........  189
BINARY_FLOAT........  611
BINARY_DOUBLE.......  473
NATURALN............  650
NUMBER.............. 1047
INTEGER............. 1839
INT................. 1841
SMALLINT............ 1842
DECIMAL............. 1933
DEC................. 1900
REAL................ 1051
FLOAT............... 1047
NUMERIC............. 1847
DOUBLE PRECISION.... 1055

PL/SQL procedure successfully completed.

As you may notice, PLS_INTEGER and BINARY_INTEGER datatypes are nearly 7 times faster than popularly used NUMBER datatype. This is because, these datatypes require less storage and they use hardware arithmetic whereas, NUMBER and INTEGER variables require calls to library routines.

Moreover, Oracle Documentation requests PL/SQL developers to avoid using INTEGER and NATURALN datatypes where performance is critical. According to the documentation, variables of these types require extra checking at run time, each time they are used in a calculation.

So, use of right datatype will really pay off in terms of performance, thus, better choose the right datatype for your operation.

References:

For more information on these datatypes and their magnitude ranges please refer to Oracle Documentation:

Use PLS_INTEGER for Integer Arithmetic.

Use BINARY_FLOAT and BINARY_DOUBLE for Floating-Point Arithmetic.

Oracle Database 10g Release 2: SQL Reference.

Oracle Database 10g Release 2: PL/SQL User’s Guide and Reference.

Happy reading !!!

5 comments:

Anonymous said...

What you are measuring here is the time that it takes to assign a value of the type used for the index of a for loop to each to the data types. My guess is that the seven to ten times faster you are seeing is because the data type of the index is the same as the data type of the assigned variable. If you really want a comparison of the performance of the different data type, you will need to actually do something with them.

Asif Momen said...

Hi Scott,

Thanks for commenting.

Based on your comments, I have modified my testcase and here are the details:

> My guess is that the seven to
> ten times faster you are seeing
> is because the data type of the
> index is the same as the data
> type of the assigned variable.

I modified datatypes to be of same type, both index datatype and the datatype of the assinged variable.

> If you really want a comparison
> of the performance of the
> different data type, you will
> need to actually do something
> with them.

I did an assignment, addition & subtraction, in short little arithmetic with the numbers.

SQL> set serveroutput on
SQL>
SQL> declare
2 l_time number;
3
4 l_bi BINARY_INTEGER;
5 l_bi_idx BINARY_INTEGER;
6 l_pi PLS_INTEGER;
7 l_pi_idx PLS_INTEGER;
8
9 l_num NUMBER;
10 l_num_idx NUMBER;
11 l_int INTEGER;
12 l_int_idx INTEGER;
13 begin
14 l_time := dbms_utility.get_time;
15 dbms_output.put_line(chr(10) || chr(10) ||
16 'Time taken for 100,000,000 iterations for : ' || chr(10) || chr(10) );
17
18 l_bi_idx := 99999999;
19 l_pi_idx := 99999999;
20 l_num_idx := 99999999;
21 l_int_idx := 99999999;
22
23 for i in 0..l_bi_idx loop
24 l_bi := i; -- Simple assigment
25 l_bi := l_bi + 50; -- Addition
26 l_bi := l_bi - 30; -- Subtraction
27 end loop;
28 dbms_output.put_line(rpad('BINARY_INTEGER', 20, ' ')
29 || ' = ' || lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));
30
31 for i in 0..l_pi_idx loop
32 l_pi := i; -- Simple assigment
33 l_pi := l_pi + 50; -- Addition
34 l_pi := l_pi - 30; -- Subtraction
35 end loop;
36 dbms_output.put_line(rpad('PLS_INTEGER', 20, ' ')
37 || ' = ' || lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));
38
39 for i in 0..l_num_idx loop
40 l_num := i; -- Simple assigment
41 l_num := l_num + 50; -- Addition
42 l_num := l_num - 30; -- Subtraction
43 end loop;
44 dbms_output.put_line(rpad('NUMBER', 20, ' ')
45 || ' = ' || lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));
46
47 for i in 0..l_int_idx loop
48 l_int := i; -- Simple assigment
49 l_int := l_int + 50; -- Addition
50 l_int := l_int - 30; -- Subtraction
51 end loop;
52 dbms_output.put_line(rpad('INTEGER', 20, ' ')
53 || ' = ' || lpad(to_char(dbms_utility.get_time - l_time), 4, ' '));
54 end;
55 /


Time taken for 100,000,000 iterations for :


BINARY_INTEGER = 523
PLS_INTEGER = 1053
NUMBER = 3404
INTEGER = 7982

PL/SQL procedure successfully completed.

Elapsed: 00:01:19.88
SQL>

If you notice, BINARY_INTEGER and PLS_INTEGER are again winning over their counterparts in terms of performance. BINARY_INTEGER still performs 6-7 times faster than NUMBER datatype whereas PLS_INTEGER is over 3 times.

Regards

Anonymous said...

Your welcome.

That looks a lot better. Out of curiosity I did a similar run on my 11g instance and found that binary_integer and pls_integer were equivalent with number 3.9 and integer 9.8 times higher. The affect of adding a not null constraint surprised me: pls_integer and binary_integer stayed the same while number and integer shot up to 7.5 and 11 times higher respectively.

On another note, I miss the food over there terribly. Nothing over here stacks up to a curbside swarma in Al Kobar

Asif Momen said...

Yes, PLS_INTEGER & BINARY_INTEGER have improved performance over other numeric datatypes. But unfortunately these are ignored and you will see Number datatype is more popularly practised.

Anonymous said...

Very nice and useful article..