Question

I am using the Data Transfer utility for IBM i to create TSV files from my AS/400s and then import them into my SQL Server data warehouse.

Following this: SO Question about SSIS encoding script, I want to stop using the conversion in the SSIS task and have the data ready from the source.

I have tried various code pages in the TSV creation (1200 etc.), but only 1208 does the trick, and only halfway: it creates UTF-8, which I then have to convert to Unicode as shown in the other question.

Which CCSID do I have to use to get Unicode from the start?

No correct solution

OTHER TIPS

On IBM i, CCSID support is intended to be seamless. Imagine the situation where the table is in German encoding, your job is in English and you are creating a new table in French - all on a system whose default encoding is Chinese. Use the appropriate CCSID for each of these and the operating system will do the character encoding conversion for you.

Unfortunately, many midrange systems aren't configured properly. Their system default CCSID is 'no CCSID / binary' - a remnant of a time some 20 years ago, before CCSID support. DSPSYSVAL QCCSID will tell you what the default CCSID is for your system. If it's 65535, that's 'binary'. This causes no end of problems, because the operating system can't figure out what the true character encoding is. Because CCSID(65535) was set for many years, almost all the tables on the system have this encoding. All the jobs on the system run under this encoding. When everything on the system is 65535, then the OS doesn't need to do any character conversion, and all seems well.

Then someone needs multi-byte characters. It might be an Asian language or, as in your case, Unicode. If the system as a whole is 'binary / no conversion', it can be very frustrating because, essentially, the system admins have lied to the operating system about the character encoding that is in effect for the database and jobs.
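To make the 'binary / no conversion' problem concrete, here is a small Python sketch (not IBM i code). It assumes Python's built-in cp037 codec as a stand-in for CCSID 37, and Latin-1 as a stand-in for how a PC might read the raw bytes when nothing converts them.

```python
# Illustration only: "cp037" stands in for CCSID 37 (EBCDIC US English).
text = "HELLO"

# The bytes as they would sit in an EBCDIC table.
ebcdic_bytes = text.encode("cp037")

# With a correct CCSID tag, the receiver knows which conversion to apply.
converted = ebcdic_bytes.decode("cp037")

# With CCSID 65535 nothing is converted; reading the raw bytes as Latin-1
# (an assumed PC-side code page) yields different characters.
unconverted = ebcdic_bytes.decode("latin-1")

print(ebcdic_bytes.hex())
print(converted)
print(repr(unconverted))
```

The same bytes give readable text only when the encoding they were written in is known, which is the information a 65535 tag withholds.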

I'm guessing that you are dealing with a CCSID(65535) environment. I think you are going to have to request some changes. At the very least, create a new work table using an appropriate CCSID, such as EBCDIC US English (37). Use a system utility like CPYF to populate this table. Now try to download that table using a CCSID of, say, 13488. If that does what you need, then perhaps all you need is an intermediate table to pass your data through.

Ultimately, the right solution is a proper CCSID configuration. Have the admins set the QCCSID system value and consider changing the encoding on the existing tables. After that, the system will handle multiple encodings seamlessly, as intended.

CCSID 13488 on IBM i is Unicode in UCS-2 form (big-endian, essentially the fixed-width predecessor of UTF-16). There is no single "Unicode" encoding; there are several Unicode encoding formats. I looked at your other question. 1208 is also Unicode, in UTF-8 form. So it is not clear what exactly "to get Unicode from the start" means, since you are already getting Unicode, in UTF-8 format. I then read your other question, and the call you mention does not spell out which kind of "Unicode" it expects:

using (StreamWriter writer = new StreamWriter(to, false, Encoding.Unicode, 1000000))  
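As a rough illustration of how these formats differ on disk, the Python sketch below encodes the same two characters three ways. The sample string is arbitrary. The mapping in the comments is an assumption to check against your setup: 1208 as UTF-8 and 13488 as big-endian come from this answer, while .NET's Encoding.Unicode is documented as UTF-16 little-endian.

```python
sample = "Aé"  # one ASCII letter and one accented letter

utf8 = sample.encode("utf-8")          # the form CCSID 1208 produces
utf16_be = sample.encode("utf-16-be")  # big-endian order, as with CCSID 13488
utf16_le = sample.encode("utf-16-le")  # the order .NET's Encoding.Unicode writes

for name, data in [("utf-8", utf8), ("utf-16-be", utf16_be), ("utf-16-le", utf16_le)]:
    print(f"{name:10} {data.hex()}")
```

For characters in this range the two UTF-16 variants carry the same code units with the two bytes of each unit swapped, so a consumer that expects one byte order will misread the other.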

By default, the operating system on IBM i stores data mainly in EBCDIC database tables, although a few applications built on this system use Unicode natively. It will translate the data into whichever type of Unicode it supports.

As for SQL Server and Java, I am fairly sure they use UCS-2-style Unicode, so if you transfer using CCSID 13488 on the AS/400 side, it may let you avoid the extra conversion from UTF-8, because CCSID 13488 is UCS-2-style Unicode.
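One way to check what a given CCSID setting actually produced is to look at the first bytes of the downloaded file. The Python sketch below guesses the encoding from a leading byte-order mark (BOM). The function name sniff_bom and the sample data are hypothetical, and the approach only works if the transfer utility writes a BOM, which is not guaranteed.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess an encoding from a leading byte-order mark (UTF-32 is ignored)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown (no BOM)"


# Hypothetical samples standing in for the start of a downloaded TSV file.
le_sample = codecs.BOM_UTF16_LE + "id\tname".encode("utf-16-le")
be_sample = codecs.BOM_UTF16_BE + "id\tname".encode("utf-16-be")
plain_sample = "id\tname".encode("utf-8")

print(sniff_bom(le_sample))
print(sniff_bom(be_sample))
print(sniff_bom(plain_sample))
```

For a real file, read the first few bytes in binary mode and pass them to the function. A result of "unknown (no BOM)" does not rule out any encoding; it only means the file has no marker.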

https://www-01.ibm.com/software/globalization/ccsid/ccsid_registered.html

There are two CCSIDs for UTF-8 on System i: 1208 and 1209. 1208 is UTF-8 with IBM PUA; 1209 is plain UTF-8. See the link above.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow