^M on Windows and Linux via Subversion

https://stackoverflow.com/questions/12209093

29-06-2021

Question

We wrote a patch that strips the ^M character from a data file using a shell script:

sed 's/^M//g' source_file > target_file

However, we keep our shell scripts in Subversion, and I already set the svn:eol-style property to native. As a result, the ^M inside the script is replaced with a newline when we run svn update on a UNIX box, and the command becomes:

sed 's/
//g' source_file > target_file

As a better practice, I recommended replacing this sed command with dos2unix:

dos2unix source_file > target_file

This got rid of the ^M characters, but as a side effect it also altered some meaningful data in source_file that shouldn't be converted.

So we want a way to strip the ^M character from the data file with a shell script that doesn't contain a literal ^M character, so that the script can be moved between Windows and Linux machines via Subversion.

What is the best practice to get rid of such issues?

Solution

What you are seeing is someone editing a file on Windows (probably with Notepad) and committing it to your Subversion repository. This adds carriage-return characters (the ^M you see) to the line endings, which breaks Makefiles and shell scripts.

Fortunately, a good program editor (that is, not Notepad) understands that different files use different line endings, and can preserve or even convert them. This means that someone working on a Windows machine and writing Unix shell scripts or Makefiles can, in theory, eliminate the ^M you're seeing. I urge developers to use IDEs like Eclipse that handle this issue, or at least a program editor like VIM or Notepad++, but many still like using Notepad and mess everything up.

What you have to do is give the developers the proper incentive to use the correct programming environment and stop messing up the files.

Here are a couple of recommendations.

You can wire up high-voltage lines to all of your developers' chairs and give them a 1,000-volt shock whenever they use Notepad to edit a file.
You can use Subversion's built-in mechanism to handle the line endings of these files.

Although the first approach is very tempting, I highly recommend the latter. Subversion has a property called svn:eol-style that forces the correct line ending onto a file automatically. For example, if I set svn:eol-style to LF, the file will always have line-feed line endings when it is committed or checked out. This way, you don't have to do any post-processing to remove the carriage returns. Problem solved.
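As a minimal sketch of setting the property by hand, the helper below wraps svn propset. The function name set_lf_eol and the file names in the usage lines are hypothetical; the sketch assumes the svn client is on the PATH and that it is run inside a working copy.

```shell
# set_lf_eol is a hypothetical helper, not part of Subversion.
# It sets svn:eol-style=LF on each file given as an argument and
# stops at the first file for which svn reports an error.
set_lf_eol() {
    for file in "$@"; do
        svn propset svn:eol-style LF "$file" || return 1
    done
}

# Usage, inside a working copy (file names are placeholders):
#   set_lf_eol deploy.sh Makefile
#   svn commit -m "Force LF line endings" deploy.sh Makefile
```

Subversion may refuse the property on a file whose line endings are mixed; in that case the file has to be normalized once before the property is set.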

The only issue is one of enforcement. When a developer creates a new file or edits an old one, they need to also set the property svn:eol-style to the correct value. There is an auto-prop mechanism in Subversion that can do this, but there is no way for you to make sure developers use it.
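For reference, auto-props live in each developer's client-side Subversion configuration file (typically ~/.subversion/config on Unix). The fragment below is a sketch, and the file patterns are assumptions chosen to match the script and Makefile cases discussed here. Because it is per-client configuration, it only helps the developers who actually add it.

```ini
[miscellany]
enable-auto-props = yes

[auto-props]
# Patterns are matched against the file name when a file is added or imported.
*.sh = svn:eol-style=LF
*.ksh = svn:eol-style=LF
Makefile = svn:eol-style=LF
```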

I use a pre-commit hook that refuses to let files be committed if they don't have this property attached. You should be able to set up the hook script so that it applies only to files that require this type of line ending (Unix scripts, Makefiles, etc.), while files that don't (Java source code, XML, etc.) aren't affected.

My pre-commit hook is fairly easy to set up and use. You use a control file to specify what you need. For example:

[PROPERTY All Unix Scripts must have "svn:eol-style" set to "LF"]
match = \.(sh|pl|py|ksh|csh)$
property = svn:eol-style
value = LF
type = string

[PROPERTY All Makefiles must have "svn:eol-style" set to "LF"]
match = [Mm]akefile
property = svn:eol-style
value = LF
type = string

This will ensure that the developers check in the files with the correct line endings in the first place, so you don't have to run post-processing scripts on them. This can greatly simplify your deployment process and eliminate one of the largest causes of errors.
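For readers who don't use that hook, the same kind of check can be sketched directly with svnlook. This is an illustration only, not the hook described above: the function names needs_lf and check_txn are hypothetical, and the shell glob patterns only approximate the regular expressions in the control file.

```shell
#!/bin/sh
# Sketch of a pre-commit check (not the hook described above).
# Subversion calls hooks/pre-commit with two arguments:
# the repository path and the transaction name.

# needs_lf: succeed for paths that must carry svn:eol-style=LF.
needs_lf() {
    case "$1" in
        *.sh|*.pl|*.py|*.ksh|*.csh|*[Mm]akefile) return 0 ;;
        *) return 1 ;;
    esac
}

# check_txn REPOS TXN: return non-zero if any added or changed
# script or Makefile in the transaction lacks svn:eol-style=LF.
check_txn() {
    repos=$1
    txn=$2
    # "svnlook changed" prints lines such as "A   trunk/build.sh".
    svnlook changed -t "$txn" "$repos" | (
        status=0
        while read -r action path; do
            case "$action" in D*) continue ;; esac   # skip deletions
            needs_lf "$path" || continue
            # propget fails when the property is absent; treat that as empty.
            value=$(svnlook propget -t "$txn" "$repos" svn:eol-style "$path" 2>/dev/null) || value=""
            if [ "$value" != "LF" ]; then
                echo "$path: svn:eol-style must be set to LF" >&2
                status=1
            fi
        done
        exit "$status"
    )
}

# When installed as hooks/pre-commit, finish the script with:
#   check_txn "$1" "$2" || exit 1
```

A non-zero exit from the hook makes Subversion reject the commit and show the messages written to standard error to the developer.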

Other tips

What about sed 's/\r$//'? It uses the \r escape sequence to denote a carriage return, and removes one only when it sits immediately before a newline.
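The \r escape is understood by GNU sed but may not be recognized by other sed implementations. A sketch that avoids both the escape and a literal ^M in the script is to build the carriage return at run time with printf. The helper name strip_cr is hypothetical.

```shell
# Build a carriage return at run time, so the script file itself
# never contains a literal ^M that svn:eol-style could rewrite.
cr=$(printf '\r')

# strip_cr is a hypothetical helper: it removes a single CR that sits
# directly before the end of a line and leaves any other CR untouched.
strip_cr() {
    sed "s/${cr}\$//" "$1"
}

# Usage:
#   strip_cr source_file > target_file
```

Only a carriage return at the end of a line is removed, so carriage returns embedded elsewhere in the data survive.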

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow