Retroactively convert a UCS-2 file to UTF-8 in Git

Tags:

git

utf-8

ucs2

I have a file with multiple commits in my Git repository that is encoded in 16-bit Unicode (UCS-2), the encoding used by Windows.

Because of that, Git considers it a binary file, instead of a text file, and I can't see the changes that different commits made.

Is there a way to retroactively convert that file to UTF-8, i.e. rebuild the history as if the file had always been UTF-8 and I had always been committing it as a UTF-8 file rather than a 16-bit Unicode file?

sashoalm asked Sep 19 '13 12:09



1 Answer

To retroactively recode a file, use git filter-branch:

git filter-branch --tree-filter 'recode utf-16..utf-8 file'

If you don't have recode, use the longer iconv -f utf-16 -t utf-8 file -o file instead. If the file doesn't exist in earlier versions of the tree, you probably want to append || true so that a failed recoding command doesn't abort the rewrite, and optionally suppress its error output.
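As a rough sketch of the "skip commits where the file is missing" idea, the conversion can be guarded with an explicit existence test instead of || true, so that real conversion errors still stop the rewrite. The function name recode_to_utf8 and the path path/to/file are hypothetical, made up for illustration. The sketch also writes to a temporary file first, on the assumption that not every iconv build can safely read and write the same path.

```shell
# Hypothetical helper: convert one file from UTF-16 to UTF-8 in place.
# Succeeds without doing anything when the file is absent, so commits
# that predate the file would not abort a history rewrite.
recode_to_utf8() {
    f=$1
    [ -f "$f" ] || return 0
    # Convert into a temporary file, then replace the original,
    # so iconv never reads and writes the same path.
    iconv -f UTF-16 -t UTF-8 "$f" > "$f.tmp" && mv "$f.tmp" "$f"
}

# The same logic inlined as a tree filter (path/to/file is a placeholder).
# An if-block is used rather than `exit`, because the filter is eval'd
# inside git filter-branch's own shell:
#
# git filter-branch --tree-filter '
#     if [ -f path/to/file ]; then
#         iconv -f UTF-16 -t UTF-8 path/to/file > path/to/file.tmp &&
#         mv path/to/file.tmp path/to/file
#     fi
# '
```

Try this on a throwaway clone first: filter-branch rewrites every commit it touches, so all rewritten commits get new hashes.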

user4815162342 answered Sep 20 '22 02:09