Question

I need to export an Oracle database from one server and import it on another server. But I am getting the following information message at the beginning:

Connected to: Oracle Database 10g Release 10.2.0.4.0 - 64bit Production Export done in US7ASCII character set and AL16UTF16 NCHAR character set server uses UTF8 character set (possible charset conversion)

Although the export still terminates successfully without warning, I was wondering whether the data in the database is still correct. Is there a need to set the NLS_LANG variable to match the character set of the server (in this case UTF-8)?

If I set the NLS_LANG variable to UTF-8 I don't get this information message. But the target server where I import the database again shows the following:

import done in UTF8 character set and AL16UTF16 NCHAR character set import server uses AL32UTF8 character set (possible charset conversion)

If I understand that correctly these two servers use (slightly) different character sets - UTF-8 and AL32UTF8.

I am confused on what I should set the NLS_LANG variable or even whether it is necessary to set. Should I use AL32UTF8 as NLS_LANG for the export and import?

EDIT: In case this information is important: I am using a SUSE Linux and for the export I use the command exp that comes with Oracle.

Was it helpful?

Solution

First of all, character set UTF-8 does not exist on Oracle, use AL32UTF8 or UTF8 (without the hyphen).

Usually you should get an error when your client character set is UTF-8:

$ setenv NLS_LANG AMERICAN_AMERICA.UTF-8
$ sqlplus ...

ERROR:
ORA-12705: Cannot access NLS data files or invalid environment specified

Character set UTF8 is identical to AL32UTF8 for characters of BMP (Basic Multilingual Plane), i.e. the first 65536 characters of Unicode. Most likely all characters in your application fall in this range, so it does not make any difference. AL32UTF8 is what is commonly called UTF-8 encoding. Oracle character set UTF8 is commonly known as CESU-8.

If you do not set any NLS_LANG variable then Oracle defaults it to AMERICAN_AMERICA.US7ASCII. Setting your client character set to US7ASCII is like you tell the Oracle Database server: "My client is able to display (only) 7-bit ASCII characters and files are saved in 7-bit ASCII format." Thus the database will replace all russian/cyrillic characters to ? or ¿ at export.

Set your NLS_LANG to AL32UTF8 (no matter which database character sets you have in place), then you will get any characters properly exported and imported.

For more information check also this document: NLS_LANG FAQ

OTHER TIPS

The NLS_LANG parameter is used by the Oracle network layer to do character translation between the client and the database. This way the client can display all the characters in the database in a 'correct' way.

Your NLS_LANG should match the character-set of your database when you do an export. If you export a UTF-8 database in US7ASCII then you risk to loose information when you import this data back into a database. Why? The UTF-8 character-set can hold much more different characters then US7ASCII. The 'unknown' characters are replaced by a single character. In the past I have seen this happening that we lost all characters with accents. They were replaced by a '?' if I remember well.

If you import then you must be sure that the NLS_LANG is set to the value of your import file. Oracle will then do the mapping to the character-set of the database in which you import.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top