How to stop git from breaking encoding on checkout

https://stackoverflow.com/questions/13704936

04-12-2021

Question

I recently added a .gitattributes file to a C# repository with the following settings:

*            text=auto
*.cs         text diff=csharp

I renormalized the repository following these instructions from GitHub and it seemed to work OK.

The problem is that when I check out some files (not all of them), I see lots of weird characters mixed in with the actual code. It seems to happen when git runs the files through the LF->CRLF conversion specified by the .gitattributes file above.

According to Notepad++, the files that get messed up use UCS-2 Little Endian or UCS-2 Big Endian encoding. The files that seem to work OK are either ANSI or UTF-8 encoded.

For reference my git version is 1.8.0.msysgit.0 and my OS is Windows 8.

Any ideas how I can fix this? Would changing the encoding of the files be enough?

Solution

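This happens with encodings in which every character takes two bytes, such as UCS-2/UTF-16.
In big-endian order, CRLF is then encoded as \0\r\0\n (in little-endian order, \r\0\n\0).

Git treats the file as a single-byte encoding and inserts a \r byte before each \n byte, so it turns that into \0\r\0\r\n.
The extra byte leaves the next line one byte off, causing every other line to be full of Chinese characters (because the \0 becomes the low-order byte rather than the high-order byte). The line break after that adds another byte, which restores the alignment for the following line.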
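The byte shift described above can be reproduced in a few lines. The following is a minimal Python sketch, not git's actual code: `checkout_lf_to_crlf` is a hypothetical stand-in for a byte-level LF to CRLF conversion, applied here to a three-line UTF-16 big-endian sample.

```python
import re


def checkout_lf_to_crlf(data: bytes) -> bytes:
    """Insert a CR byte before every LF byte not already preceded by CR.

    Hypothetical stand-in for a byte-level LF -> CRLF conversion that
    assumes a single-byte encoding. This is an illustration, not git's code.
    """
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def demo() -> str:
    # Three lines separated by CRLF, stored as UTF-16 big-endian (no BOM).
    original = "ab\r\ncd\r\nef".encode("utf-16-be")
    # Each LF byte is preceded by \0, not \r, so every one gets an extra CR.
    converted = checkout_lf_to_crlf(original)
    # Decoding the result shows the middle line shifted by one byte.
    return converted.decode("utf-16-be")


if __name__ == "__main__":
    print(repr(demo()))
```

In this sample the first and third lines decode normally, while the characters of the middle line land in the CJK range because each pair of bytes is read one byte late.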

You can convert files to UTF-8 using this LINQPad script:

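// Root folder to scan (the path is elided in the original answer).
const string path = @"C:\...";
// Extensions to convert; adjust to the file types affected in your repository.
var extensions = new[] { ".html", ".js" };
foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
{
    if (!extensions.Contains(Path.GetExtension(file)))
        continue;
    // ReadAllLines detects the source encoding from the byte order mark;
    // the file is rewritten as UTF-8 (with BOM) using CRLF line endings.
    File.WriteAllText(file, String.Join("\r\n", File.ReadAllLines(file)), new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));
    file.Dump(); // LINQPad: print the path of each converted file
}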

This will not fix files that are already broken; you can repair those by replacing \r\n with \n at the byte level in a hex editor. I don't have a LINQPad script for that, since there's no simple Replace() method for byte[]s.
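The byte-level replacement does not have to be done by hand. Below is a minimal Python sketch of the same repair, under two assumptions: the damaged files start with a UTF-16 byte order mark, and they contain no legitimate \r\n byte pair (which is possible in UTF-16 but unlikely in source code). The function names are illustrative and do not come from the answer. Back up the files before running anything like this.

```python
import codecs
from pathlib import Path


def has_utf16_bom(data: bytes) -> bool:
    """True if the data starts with a UTF-16 byte order mark (either byte order)."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


def strip_inserted_cr(data: bytes) -> bytes:
    """Replace every CR LF byte pair with a lone LF byte."""
    return data.replace(b"\r\n", b"\n")


def repair_file(path: Path) -> bool:
    """Repair one file in place; return True if it was changed."""
    data = path.read_bytes()
    if not has_utf16_bom(data):
        # Leave UTF-8/ANSI files alone: their CR LF pairs are real line endings.
        return False
    fixed = strip_inserted_cr(data)
    if fixed == data:
        return False
    path.write_bytes(fixed)
    return True
```

For example, `repair_file(Path("Broken.cs"))` would rewrite a single damaged file, where `Broken.cs` is a placeholder name.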

Other tips

To fix this, either convert the encoding of the files (UTF-8 should be OK) or disable automatic line-ending conversion (set git config core.autocrlf false and remove or adjust the text settings in your .gitattributes).
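Either option requires knowing which files are affected. One way to list them, sketched in Python under the assumption that the UTF-16 files carry a byte order mark, is to scan the working tree for that marker. `find_utf16_files` and its `.cs` default are illustrative choices, not part of the answer.

```python
import codecs
from pathlib import Path


def is_utf16_with_bom(data: bytes) -> bool:
    """True if data starts with a UTF-16 BOM and not a UTF-32 one."""
    # The UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM,
    # so rule UTF-32 out first.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return False
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


def find_utf16_files(root: Path, suffix: str = ".cs") -> list:
    """Return the files under root with the given suffix that start with a UTF-16 BOM."""
    matches = []
    for path in root.rglob(f"*{suffix}"):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            head = handle.read(4)  # the first four bytes are enough for any BOM
        if is_utf16_with_bom(head):
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    for found in find_utf16_files(Path(".")):
        print(found)
```

The resulting list can then be converted to UTF-8 or excluded from line-ending conversion.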

Licensed under: CC-BY-SA with attribution