
Why is it that UTF-8 encoding is used when interacting with a UNIX/Linux environment?

I know it is customary, but why? Are there real technical reasons why any other way would be a really bad idea or is it just based on the history of encoding and backwards compatibility? In addition, what are the dangers of not using UTF-8, but some other encoding (most notably, UTF-16)?

Edit: By interacting, I mostly mean the shell and libc.

Carl asked Oct 02 '08 20:10



1 Answer

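Partly because the file systems (and the kernel and libc interfaces around them) expect a NUL ('\0') byte to terminate a file name, so UTF-16 would not work well. UTF-16 encodes every ASCII character as two bytes, one of which is zero, so a NUL-terminated API would see the name end after its first character. UTF-8 never produces a zero byte except for U+0000 itself, so existing byte-oriented code keeps working unchanged. To switch to UTF-16 you'd have to modify a lot of code.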
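To make the NUL problem concrete, here is a minimal Python sketch. The file name `notes.txt` and the helper `c_string_view` are made up for illustration; the helper only imitates how a NUL-terminated C interface reads a buffer, it is not a real libc call.

```python
# Hypothetical file name, chosen only for illustration.
name = "notes.txt"

utf8 = name.encode("utf-8")
utf16 = name.encode("utf-16-le")  # little-endian, no byte-order mark


def c_string_view(raw: bytes) -> bytes:
    """Imitate a NUL-terminated C string: keep the bytes before the first 0x00."""
    return raw.split(b"\x00", 1)[0]


print(utf8)                  # the plain ASCII bytes, no zero byte anywhere
print(utf16)                 # every character is followed by a zero byte
print(c_string_view(utf8))   # the whole name survives
print(c_string_view(utf16))  # only the first character survives
```

The same truncation would hit anything that passes the name through `char *` interfaces such as `open()` or `strlen()`, which is why a switch to UTF-16 would mean touching so much code.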

Jonathan Leffler answered Oct 20 '22 21:10
