
GitHub: Using UTF-8 encoding for files

Why isn't UTF-8 the default encoding for GitHub?

Are there any drawbacks to changing from the default "cp1252"?

Does it have anything to do with using GitHub across platforms, between Windows and Mac OS X?

I'm asking in the context of GitHub, but the question applies to Git in general.

I mostly do development in .NET, HTML5 and JavaScript, if that matters.

Seb Nilsson asked Nov 07 '11 15:11



1 Answer

Wild guess: are you using TortoiseGit? Is that where you're seeing a default encoding set to cp1252?

If so, it's simply TortoiseGit using the default encoding of your Windows installation.

Edit: Exactly the same is true for the Git GUI.
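
To see which encoding a given machine reports as its locale default, a minimal Python sketch like the following may help. The function name `system_text_encoding` is made up for this illustration, and the value it returns is what Python sees, which is not guaranteed to be the exact value TortoiseGit or Git GUI picks up.

```python
import locale


def system_text_encoding() -> str:
    """Return the encoding name the current locale prefers for text.

    Hypothetical helper for illustration only. On a Western European
    Windows installation this is typically "cp1252"; on most Linux and
    macOS systems it is typically "UTF-8". Python's UTF-8 mode, if
    enabled, can override the reported value.
    """
    # Passing False asks for the current setting without changing the locale.
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    print(system_text_encoding())
```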

Here's a discussion from the Git developers' mailing list giving an explanation:

  • Make diffs and blame default to the system (locale) encoding instead of hard-coding UTF-8.
  • Add a gui.encoding option to allow overriding it.
  • gitattributes still have the final word.

The rationale for this is Windows support:

  1. Windows people are accustomed to using legacy encodings for text files. For many of them defaulting to utf-8 will be counter-intuitive.
  2. Windows doesn't support utf-8 locales, and switching the system encoding is a real pain. Thus the option.
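
Why the display encoding matters can be shown with a small Python sketch. The sample string is an arbitrary choice for this illustration and does not come from the mailing list thread. The same text produces different bytes in cp1252 and UTF-8, so a tool that assumes the wrong one either shows garbage or fails to decode.

```python
# Arbitrary sample text containing one non-ASCII character (e with acute accent).
text = "caf\u00e9"

# The same text stored in a legacy Windows encoding and in UTF-8.
cp1252_bytes = text.encode("cp1252")  # b'caf\xe9'
utf8_bytes = text.encode("utf-8")     # b'caf\xc3\xa9'

# A viewer that assumes cp1252 but is given UTF-8 bytes shows two wrong
# characters (A-tilde followed by the copyright sign) instead of the accented e.
misread = utf8_bytes.decode("cp1252")

# A viewer that assumes UTF-8 but is given cp1252 bytes cannot decode them
# at all when decoding strictly, because 0xE9 alone is not valid UTF-8.
try:
    cp1252_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

print(cp1252_bytes, utf8_bytes, strict_decode_failed)
```

Pure ASCII text is byte-for-byte identical in both encodings, so the difference only shows up in files that contain non-ASCII characters.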
Michael Borgwardt answered Oct 09 '22 14:10