Question

I have a file in my Git repository that has been changed in multiple commits and is encoded in 16-bit Unicode (UCS-2), as used by Windows.

Because of that, Git considers it a binary file, instead of a text file, and I can't see the changes that different commits made.

Is there a way to retroactively convert that file to UTF-8, i.e. rebuild the history as if the file had always been UTF-8 and I had always been committing it as a UTF-8 file rather than a 16-bit Unicode file?


Solution

To retroactively recode a file, use git filter-branch:

git filter-branch --tree-filter 'recode utf-16..utf-8 file'

If you don't have recode, use the longer iconv -f utf-16 -t utf-8 file -o file instead. If the file doesn't exist in earlier versions of the tree, you probably want to append || true so that the recoding command doesn't abort the rewrite, and optionally suppress its error output (for example with 2>/dev/null).
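To try the approach safely before touching a real repository, here is a minimal sketch that builds a scratch repository and rewrites it. Everything about the scratch repository (the README, the commit messages, the demo identity, the file.tmp name) is made up for illustration; only the name file comes from the answer above. The sketch assumes iconv is installed. It tests for the file explicitly instead of appending || true, and it writes to a temporary file first, since reading and writing the same file with iconv may not be safe on every implementation.

```shell
#!/bin/sh
# Scratch demo: build a tiny repository with a UTF-16 file, then rewrite it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com   # hypothetical identity for the demo
git config user.name Demo

# Commit 1 has no "file" yet; commits 2 and 3 store "file" as UTF-16.
echo readme > README
git add README
git commit -q -m "add README"
printf 'hello\n' | iconv -f utf-8 -t utf-16 > file
git add file
git commit -q -m "add file"
printf 'hello\nworld\n' | iconv -f utf-8 -t utf-16 > file
git commit -q -am "extend file"

# Rewrite every commit reachable from HEAD. The -f test skips commits
# where the file does not exist; the temporary file avoids writing in place.
# FILTER_BRANCH_SQUELCH_WARNING silences the notice newer Git versions print.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --tree-filter '
  if [ -f file ]; then
    iconv -f utf-16 -t utf-8 file > file.tmp && mv file.tmp file
  fi
' HEAD

# The history of "file" should now show up as ordinary text diffs.
git --no-pager log -p --oneline -- file
```

On a real repository only the git filter-branch command is needed. It rewrites the commits, so their hashes change, and Git keeps the previous branch tips under refs/original/ in case the result needs to be compared or undone.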

Licensed under: CC-BY-SA with attribution