
What does setlocale(LC_CTYPE, 'C'); actually do?

When my PHP script processes UTF-8 text containing non-ASCII characters, some PHP string functions, such as strtolower(), don't work correctly.

I could use mb_strtolower, but this script can be run on all sorts of different platforms and configurations, and the multibyte string extension might not be available. I could check whether the function exists before use, but I have string functions littered throughout my code and would rather not replace every instance.

Someone suggested using setlocale(LC_CTYPE, 'C'), which he says causes the string functions to work correctly. This sounds fine, but I don't want to introduce that change without understanding exactly what it is doing. I have used setlocale to change the formatting of numbers before, but I have never used the LC_CTYPE category, and I don't really understand what it does. What does the value 'C' mean?

Russ asked Mar 08 '11 11:03





1 Answer

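'C' means "use whatever locale is hard coded", that is, the minimal default locale built into the C library (and since most *NIX programs are written in C, it's called C). It is usually not a UTF-8 locale, however: under it only ASCII characters are classified as letters, so byte-oriented functions such as strtolower() change A-Z and leave every other byte untouched.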
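As a rough illustration of what that looks like in practice, here is a minimal sketch. The sample string is just an assumption for the demo, written with explicit byte escapes so it does not depend on the encoding of the source file.

```php
<?php
// Switch only the character-classification category to the "C" locale.
// setlocale() returns the name of the new locale, or false on failure.
$result = setlocale(LC_CTYPE, 'C');

// "ÄBC" encoded as UTF-8: Ä is the two bytes 0xC3 0x84.
$text = "\xC3\x84BC";

// strtolower() works byte by byte. Under the C locale only A-Z are
// treated as upper-case letters, so the two bytes of Ä pass through
// unchanged and only "BC" is lower-cased.
$lower = strtolower($text);

echo bin2hex($lower), "\n"; // hex dump of the resulting bytes
```

So the call does not make strtolower() understand UTF-8; it only keeps it from touching bytes outside the ASCII range.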

If you are using a multibyte charset such as UTF-8, you cannot rely on the regular string functions for non-ASCII characters; you need their mb_ counterparts. However, almost every PHP installation should have the mbstring extension enabled.
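If you do need to run where mbstring might be missing, one option is a small wrapper. The sketch below is only an assumption about how you might structure it: utf8_lower() is a hypothetical helper name, not a PHP built-in, and the fallback branch lower-cases ASCII letters only.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): lower-case a UTF-8 string.
 * Uses mbstring when it is available, otherwise falls back to an
 * ASCII-only conversion that leaves multibyte sequences intact.
 */
function utf8_lower(string $text): string
{
    if (function_exists('mb_strtolower')) {
        // Full Unicode-aware lower-casing.
        return mb_strtolower($text, 'UTF-8');
    }

    // Fallback: force the C locale for character classification so that
    // strtolower() only changes A-Z. Note that setlocale() affects the
    // whole process, not just this call.
    setlocale(LC_CTYPE, 'C');
    return strtolower($text);
}

echo utf8_lower('HELLO'), "\n";
```

You would still have to call the wrapper instead of strtolower() wherever non-ASCII input is possible, so this reduces the per-call checks rather than removing the search-and-replace entirely.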

ThiefMaster answered Nov 05 '22 13:11
