Why isn't UTF-8 the default encoding for GitHub?
Does that mean there are drawbacks to changing from the default "cp1252"?
Does it have anything to do with using GitHub across platforms, between Windows and Mac OS X?
The question comes from using GitHub, but it applies to Git in general.
I'm mostly doing development in .NET, HTML5 and JavaScript, if that matters.
Most Microsoft Windows text files use "ANSI", "OEM", "Unicode" or "UTF-8" encoding.
UTF-16 is, obviously, more efficient for A) characters for which UTF-16 requires fewer bytes to encode than does UTF-8. UTF-8 is, obviously, more efficient for B) characters for which UTF-8 requires fewer bytes to encode than does UTF-16.
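To make the comparison above concrete, here is a minimal Python sketch that counts the bytes each encoding needs for a few sample characters. The helper name `encoded_sizes` and the sample characters are illustrative choices, not something from the original comment.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for a string.

    "utf-16-le" is used so that no byte order mark is added to the count.
    """
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# Hypothetical sample: ASCII, Latin-1 range, a BMP symbol, and a non-BMP emoji.
for ch in ["A", "é", "€", "😀"]:
    utf8, utf16 = encoded_sizes(ch)
    print(f"{ch!r}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

ASCII text is smaller in UTF-8, characters such as "€" are smaller in UTF-16, and some characters take the same space in both, so which one is "more efficient" depends on the content.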
Wild guess: are you using TortoiseGit? Is that where you're seeing a default encoding set to cp1252?
If so, it's simply TortoiseGit using the default encoding of your Windows installation.
Edit: Exactly the same is true for the Git GUI.
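If you want to check what your system's default encoding actually is, one way (outside of Git itself) is to ask Python for the locale's preferred encoding. This is only a sketch for inspecting the machine; it does not show how TortoiseGit or Git GUI look the value up, and `system_encoding` is a hypothetical helper name.

```python
import locale


def system_encoding() -> str:
    """Return the encoding the current locale prefers for text data."""
    # Passing False avoids changing the process-wide locale as a side effect.
    return locale.getpreferredencoding(False)


print(system_encoding())
```

On a Western European Windows installation this typically reports cp1252, while most Linux and Mac systems report UTF-8. The result can differ if Python's UTF-8 mode is enabled.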
Here's a discussion from the Git developers' mailing list that gives an explanation:
- Make diffs and blame default to the system (locale) encoding instead of hard-coding UTF-8.
- Add a gui.encoding option to allow overriding it.
- gitattributes still have the final word.
The rationale for this is Windows support:
- Windows people are accustomed to using legacy encodings for text files. For many of them defaulting to utf-8 will be counter-intuitive.
- Windows doesn't support utf-8 locales, and switching the system encoding is a real pain. Thus the option.
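To illustrate why the assumed encoding matters for files written in a legacy encoding, here is a small Python sketch of what happens when the same text is read with the wrong one. The sample word and the helper `decode_or_none` are assumptions made for illustration; this is not how Git GUI is implemented.

```python
from typing import Optional


def decode_or_none(data: bytes, encoding: str) -> Optional[str]:
    """Decode bytes with the given encoding, or return None if they are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "café"
legacy_bytes = text.encode("cp1252")  # how a legacy Windows editor might save it
utf8_bytes = text.encode("utf-8")     # how a UTF-8 editor would save it

# A tool that hard-codes UTF-8 cannot decode the legacy file at all.
print(decode_or_none(legacy_bytes, "utf-8"))

# A tool that assumes cp1252 decodes the UTF-8 file, but shows mojibake.
print(decode_or_none(utf8_bytes, "cp1252"))
```

Either mismatch gives a broken diff or blame view, which is why a configurable encoding with a per-file override is useful.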