Question

Morning All -

I have a problem and the solution has eluded me for a few days now.

I have an SSIS package that performs the following:

  1. Run a SQL script.

  2. Export the results to a flat file (UTF-8 encoded, ; delimited, and \n for new lines).

  3. FTP the results to a Solaris machine (binary mode).

The problem is that when the file shows up on my Solaris box, it has the following at the start of the file.

\377\376

I have tried dos2unix, but it has not corrected the issue. In fact, it changes the \377\376 to \227\226, which is not very helpful.

My question: is there any way to remove these characters from my file? When they are there, they interfere with grep and other Unix tools like head.

Any help is appreciated.

Thanks

Solution

This was an easy fix that eluded me for a few days. I thought I would answer my own question in case anyone else is searching for an answer.

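The \377\376 at the start of the file is the octal form of the bytes FF FE, the byte order mark (BOM) of a little-endian UTF-16 file. That is what SSIS and other Windows tools write when a file is saved as "Unicode" (UCS-2 little-endian), so the file was not actually UTF-8. The easiest fix is to re-encode the file on your Unix server with the following commands.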

  1. Switch over to UTF-8 (or whatever encoding you need) with iconv. Encoding names vary between iconv implementations; iconv -l lists the ones available on your system.

    iconv -f UCS-2LE -t UTF-8 input > output

  2. Remove the carriage returns that Windows adds to the end of lines.

    dos2unix -ascii utf-8-file outputfile

And that will solve your problem.
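
The two steps above can be tried end to end on a small sample. The following is a minimal sketch, not the exact commands from the answer: the file names are hypothetical, it assumes an iconv that accepts the name UTF-16 and uses the BOM to pick the byte order (GNU iconv does; other implementations may differ), and it swaps in tr -d '\r' for dos2unix to strip the carriage returns.

```shell
#!/bin/sh
# Sketch: convert a UTF-16LE file with a BOM and CRLF line endings into
# BOM-less UTF-8 with LF line endings. File names are hypothetical.

# Build a small sample: BOM (FF FE) followed by "a;b\r\n" in UTF-16LE.
printf '\377\376a\000;\000b\000\r\000\n\000' > sample_utf16.txt

# Inspect the leading bytes; od -c prints FF FE as 377 376.
od -c sample_utf16.txt | head -n 1

# Assumption: with -f UTF-16, iconv reads the BOM to choose the byte order
# and does not copy it to the output. tr -d '\r' then removes the CRs.
iconv -f UTF-16 -t UTF-8 sample_utf16.txt | tr -d '\r' > sample_utf8.txt

# Inspect the result; it should no longer start with 377 376.
od -c sample_utf8.txt
```

If the output still begins with \357\273\277, the BOM was carried over as a UTF-8 BOM and needs to be removed separately.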

OTHER TIPS

Dos2unix version 6.0 and higher can convert Windows Unicode UTF-16 files to Unix UTF-8. It will also remove the Byte Order Mark (BOM).

There is a Windows version available.

As the previous answers stated, dos2unix did the job. In my case I used:

    dos2unix.exe -r -v -f -D utf8 <FileName>

in which:

-r, --remove-bom remove Byte Order Mark (default)

-v, --verbose verbose operation

-f, --force force conversion of binary files

-D, --display-enc set the encoding of displayed text messages: ansi, unicode, or utf8 (default: ansi)

And the BOM character was removed.
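
To confirm that the BOM is really gone after a conversion, the first bytes of the file can be checked directly. This is a rough sketch under stated assumptions: has_bom and the sample file names are hypothetical, and only the UTF-16LE BOM (FF FE) and the UTF-8 BOM (EF BB BF) are recognized.

```shell
#!/bin/sh
# has_bom FILE: exit status 0 if FILE starts with a UTF-16LE BOM (FF FE)
# or a UTF-8 BOM (EF BB BF), 1 otherwise. The function name is hypothetical.
has_bom() {
    # od -N3 reads only the first three bytes; -An -tx1 prints them as
    # hex without an offset column, and tr removes the spacing.
    first=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    case "$first" in
        fffe*|efbbbf*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical sample files for demonstration.
printf '\377\376a\000' > with_bom.txt
printf 'abc\n' > without_bom.txt

for f in with_bom.txt without_bom.txt; do
    if has_bom "$f"; then
        echo "$f: BOM present"
    else
        echo "$f: no BOM"
    fi
done
```

Running the same check on a file before and after dos2unix shows whether the conversion removed the mark.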

Licensed under: CC-BY-SA with attribution