Displaying the hex value of a string from an Oracle VARCHAR2?

Question 1

Use the dump function to see how Oracle stores data internally.

You seem to have a misunderstanding of how Oracle treats VARCHAR2 character set conversions: you can't influence how Oracle physically stores its data. (Also, if you haven't already, it's helpful to read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.)

Your client speaks to Oracle only in binary. In fact, all systems exchange information in binary only. To understand each other, both systems must know which language (character set) is being used.

In your case we can reconstruct what happens:

Your client sends the byte dd to Oracle and says it is windows-1252 (instead of 1254).
Oracle looks up its character set table and sees that this data is translated to the symbol Ý in this character set.
Oracle logically stores this information in its table.
Since Oracle is set up in UTF-8, it converts this data to the UTF-8 binary representation of Ý:
```
SQL> SELECT rawtohex('Ý') FROM dual;

RAWTOHEX('Ý')
--------------
C39D
```
Oracle stores C39D internally.
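
The steps above can be reproduced outside the database. The following is a minimal Java sketch, not Oracle's actual implementation: the class and method names (`CharsetMislabelDemo`, `storedHex`) are made up for illustration, and it assumes the JRE ships the `windows-1254` charset (a standard full JDK does). It decodes the byte the client sent using the character set the client declared, then re-encodes the result as UTF-8, which models what a UTF-8 database does with a VARCHAR2 value.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMislabelDemo {
    // Hex-encode bytes in the same style RAWTOHEX prints them (uppercase, no separators).
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical helper: decode the client's bytes with the charset the client
    // declared, then re-encode as UTF-8 (a model of the database-side conversion).
    static String storedHex(byte[] clientBytes, String declaredCharset) {
        String symbols = new String(clientBytes, Charset.forName(declaredCharset));
        return toHex(symbols.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] sent = { (byte) 0xDD };
        // Client declares windows-1252: the byte is read as U+00DD (Y with acute).
        System.out.println(storedHex(sent, "windows-1252")); // C39D
        // Client declares windows-1254: the byte is read as U+0130 (dotted capital I).
        System.out.println(storedHex(sent, "windows-1254")); // C4B0
    }
}
```

The same byte ends up as two different UTF-8 sequences depending only on the declared character set, which is why the fix belongs in the client configuration rather than in the stored data.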

As you can see, the problem comes from the first step: the client declares the wrong character set. Until you fix this setup problem, the two systems won't be able to communicate correctly.

The conversion is automatic when you use VARCHAR2 because this datatype is a logical interface to text symbols (you have next to no control over the actual bytes that get stored).

Question 2

I have bytes in UTF-8 to begin.

String strFromUTF8 = new String(bytes, "UTF8");
byte[] strInOldStyle = strFromUTF8.getBytes("Cp1254");

With MySQL, I am done. I take these bytes, turn them into a hex string and do an update with unhex(hexStr). This allows me to put the legacy bytes into a varchar column.

With Oracle, I must do:

String again = new String(strInOldStyle, "Cp1252"); // reinterpret the Cp1254 bytes as Cp1252, the charset the client declares to Oracle
byte[] nextOldBytes = again.getBytes("UTF8");

Now, I can do an update and get the bytes into a varchar2 column with:

update table set colName = UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW('hexStr')) where ...

Strange, no? I am sure I have made this more complex than it needed to be.
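
For reference, here is the whole conversion chain as one self-contained Java sketch. The class and method names (`LegacyHexBuilder`, `hexForUpdate`) are hypothetical, and it assumes that the intermediate decode uses windows-1252, the character set the legacy client declares to Oracle, while the client's text is really windows-1254. Treat it as an illustration of the idea rather than a recommended fix.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LegacyHexBuilder {
    // Character set the legacy client actually uses for its text (assumption).
    static final Charset ACTUAL = Charset.forName("windows-1254");
    // Character set the legacy client declares to Oracle (assumption).
    static final Charset DECLARED = Charset.forName("windows-1252");

    // Turn UTF-8 input into the hex literal to pass to HEXTORAW.
    static String hexForUpdate(byte[] utf8Input) {
        String text = new String(utf8Input, StandardCharsets.UTF_8);
        byte[] legacyBytes = text.getBytes(ACTUAL);            // bytes the legacy client expects
        String mislabeled = new String(legacyBytes, DECLARED); // symbols Oracle believes they are
        byte[] stored = mislabeled.getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : stored) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // U+0130 is the dotted capital I; written as an escape to avoid source-encoding issues.
        byte[] input = "\u0130".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexForUpdate(input)); // C39D
    }
}
```

Plain ASCII text passes through unchanged, since all three character sets agree on that range; only the characters where windows-1252 and windows-1254 differ end up stored as a different symbol.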

What we see is this, though:

"İ" in UTF-8 == 0xc4b0
"İ" in Cp1254 == 0xdd == "Ý" in Cp1252
"Ý" in UTF-8 == 0xc39d

So, if I get the string "İ" and do:

update table set name = UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW('C39D')) where ...

Then our legacy client gives us a "İ". Yep. It works.
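
Why this works can be checked by simulating the read path. The sketch below is a rough model under stated assumptions, not Oracle's code: it assumes the database converts the stored UTF-8 value to windows-1252 for the client, and that the legacy client then renders the received bytes as windows-1254. The names `LegacyClientReadDemo`, `fromHex`, and `seenByLegacyClient` are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LegacyClientReadDemo {
    // Parse a hex string such as "C39D" into raw bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Model of a read: stored UTF-8 bytes -> symbols -> bytes in the declared
    // client charset -> symbols as the legacy client actually renders them.
    static String seenByLegacyClient(String storedHex) {
        String symbols = new String(fromHex(storedHex), StandardCharsets.UTF_8);
        byte[] onTheWire = symbols.getBytes(Charset.forName("windows-1252"));
        return new String(onTheWire, Charset.forName("windows-1254"));
    }

    public static void main(String[] args) {
        String shown = seenByLegacyClient("C39D");
        // Print the code point rather than the glyph to avoid console-encoding issues.
        System.out.println(Integer.toHexString(shown.codePointAt(0))); // 130
    }
}
```

Code point U+0130 is "İ", so the stored "Ý" comes out on the legacy client as the character it was originally meant to be.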