Tuesday, October 14, 2008

How to read Data between different Characterset Databases

We have two 24x7 production databases with different charactersets. Yes, both charactersets are related to Arabic language. Database A’s characterset is AR8MSWIN1256 while database B’s characterset is AR8ISO8859P6.

We had a requirement wherein database A will read Arabic data from database B. If you create a database link and try to query data from database B it will appear to be garbage, as shown below:

whereas when you run the same query on database B, this is how the output looks:

Even you are unaware of Arabic language, you may visually compare and see the output is a garbage when queried from database A.

DUMP function comes handy under these situations. It returns internal representation of an expression. When used to against the “name” failed, I get this output.

SQL> select dump(name) from test_iso@testdb;

DUMP(NAME)
--------------------------------------------------------------------
Typ=1 Len=13: 227,205,227,207,32,194,213,221,32,227,196,227,228

where, “Typ” refers to datatype, “Len” means the length of the string, and everything after “:” is comma separated ASCII values of all the characters in the “name” field.

I wrote a small procedure where convert all the comma separated ASCII values to character string.

create or replace function read_name (p_id in number) return varchar2 is
  t_str varchar2(2000);
  l_name varchar2(2000);
  l_output varchar2(2000);
begin
  select 'select '||'chr('||replace(substr(dump(name, 1010), instr(dump(name, 1010), ':')+2), ',', ')|| chr(')||') from dual' 
    into t_str from test_iso@testdb
   where id = p_id;
 
--  dbms_output.put_line(t_str);
  execute immediate t_str into l_output;  
  return l_output;
end;
/

Now upon using this function, I am able to query remote table in a readable format.

Here’s the output: