'^M' character at end of lines
-
09-06-2019 - |
Question
When I run a particular SQL script in Unix environments, I'm am seeing a '^M' character at the end of each line of the SQL script as it is echoed to the command-line. I don't know on which OS the SQL script was originally created.
What is causing this and how do I fix it?
Solution
It's caused by the DOS/Windows line-ending characters. Like Andy Whitfield said, the Unix command dos2unix will help fix the problem. If you want more information, you can read the man pages for that command.
OTHER TIPS
Fix line endings in vi
by running the following:
:set fileformat=unix
:w
The cause is the difference between how a Windows-based based OS and a Unix based OS store the end-of-line markers.
Windows based operating systems, thanks to their DOS heritage, store an end-of-line as a pair of characters - 0x0D0A
(carriage return + line feed). Unix-based operating systems just use 0x0A
(a line feed). The ^M
you're seeing is a visual representation of 0x0D
(a carriage return).
dos2unix will help with this. You probably also need to adjust the source of the scripts to be 'Unix-friendly'.
The easiest way is to use vi
. I know that sounds terrible but its simple and already installed on most UNIX environments. The ^M is a new line from Windows/DOS environment.
from the command prompt: $ vi filename
Then press ":
" to get to command mode.
Search and Replace all Globally is :%s/^M//g
"Press and hold control then press V then
M" which will replace ^M with nothing.
Then to write and quit enter ":wq
" Done!
Try using dos2unix to strip off the ^M.
In vi, do a :%s/^M//g
To get the ^M
hold the CTRL key, press V then M (Both while holding the control key) and the ^M
will appear. This will find all occurrences and replace them with nothing.
The SQL script was originally created on a Windows OS. The '^M' characters are a result of Windows and Unix having different ideas about what to use for an end-of-line character. You can use perl at the command line to fix this.
perl -pie 's/\r//g' filename.txt
The ^M is typically caused by the Windows operator newlines, and translated onto Unix looks like a ^M. The command dos2unix should remove them nicely
dos2unix [options] [-c convmode] [-o file ...] [-n infile outfile ...]
C:\tmp\text>dos2unix hello.txt helloUNIX.txt
Sed is even more widely available and can do this kind of thing also if dos2unix is not installed
C:\tmp\text>sed s/\r// hello.txt > helloUNIX.txt
You could also try tr:
cat hello.txt | tr -d \r > helloUNIX2.txt
Here are the results:
C:\tmp\text>dumphex hello.txt
00000000h: 48 61 68 61 0D 0A 68 61 68 61 0D 0A 68 61 68 61 Haha..haha..haha
00000010h: 0D 0A 0D 0A 68 61 68 61 0D 0A ....haha..
C:\tmp\text>dumphex helloUNIX.txt
00000000h: 48 61 68 61 0A 68 61 68 61 0A 68 61 68 61 0A 0A Haha.haha.haha..
00000010h: 68 61 68 61 0A haha.
C:\tmp\text>dumphex helloUNIX2.txt
00000000h: 48 61 68 61 0A 68 61 68 61 0A 68 61 68 61 0A 0A Haha.haha.haha..
00000010h: 68 61 68 61 0A haha.
To replace ^M characters in vi editor use below
open the text file say t1.txt
vi t1.txt
Enter command mode by pressing shift + :
then press keys as mentioned %s/^M/\r/g
in above ^M is not (shift + 6)M instead it is (ctrl + V)(ctrl + M)
An alternative to dos2unix
command would be using standard utilities like sed
.
For example, dos to unix:
sed 's/\r$//' dos.txt > unix.txt
unix to dos:
sed 's/$/\r/' unix.txt > dos.txt
You can remove ^M from the files directly via sed command, e.g.:
sed -i'.bak' s/\r//g *.*
If you're happy with the changes, remove the .bak files:
rm -v *.bak
Convert DOS/Windows (\r\n) line endings to Unix (\n) line endings, with tr:
tr '\r\n' '\n' < dosFile.txt > unixFile.txt
od -a $file
is useful to explore those types of question on Linux (similar to dumphex in the above).
In Perl, if you don't want to set the $/ variable and use chomp() you can also do:
$var =~ /\r\n//g;
My two cents
Another vi command that'll do: :%s/.$//
This removes the last character of each line in the file. The drawback to this search and replace command is that it doesn't care what the last character is, so be careful not to call it twice.