UTF-8 in Windows

How do I set the code page to UTF-8 in a C Windows program?

I have a third-party library that uses fopen to open files. I can use wcstombs to convert my Unicode filenames to the current code page, but this breaks if the user has a filename containing a character outside that code page.
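To make the failure concrete, here is a minimal sketch of the conversion step. The helper name narrow_filename is made up for illustration, and it assumes the POSIX/MSVC behavior where wcstombs with a NULL destination returns the required byte count. It reports an error when the filename cannot be represented in the current locale's encoding instead of handing a mangled name to fopen.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide filename to the current locale's
   multibyte encoding. Returns 0 on success, or -1 if a character cannot
   be represented or the output buffer is too small. */
int narrow_filename(const wchar_t *wide, char *out, size_t cap)
{
    assert(wide != NULL && out != NULL);

    /* Assumption: a NULL destination makes wcstombs return the number of
       bytes needed (POSIX and MSVC behavior), or (size_t)-1 on a
       character that has no representation in the current encoding. */
    size_t need = wcstombs(NULL, wide, 0);
    if (need == (size_t)-1)
        return -1;              /* character outside the code page */
    if (need + 1 > cap)
        return -1;              /* no room for the text plus terminator */

    wcstombs(out, wide, cap);   /* fits, so the result is null-terminated */
    return 0;
}
```

When this returns -1 for a name that really exists on disk, that is exactly the case where the library's fopen call cannot be given a usable path.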

Ideally I would just call _setmbcp(65001) to set the code page to UTF-8, but the MSDN documentation for _setmbcp states that UTF-8 is not supported.

How can I get around this?

asked Oct 03 '08 12:10 by Michael Platings



1 Answer

Unfortunately, there is no way to make UTF-8 the current code page in Windows. The CP_UTF7 and CP_UTF8 constants are pseudo-code pages, usable only with the MultiByteToWideChar and WideCharToMultiByte conversion functions, as Ben mentioned.

Your problem is similar to that of the C++ fstream classes. The fstream constructors accept only char* names, making it impossible to open a file with a true Unicode name. The only solution offered by VC was a hack: open the file separately and then attach the handle to the stream object. I'm afraid this isn't an option for you, of course, since the third-party library probably doesn't accept handles.

The only solution I can think of is to create a temporary file with a non-Unicode name, hard-linked to the original, and pass that name to the library instead.
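A rough sketch of that workaround follows. The function names (widen_ascii, make_ascii_alias, remove_ascii_alias) are invented for illustration. It assumes the alias path is pure ASCII and sits on the same volume as the original, since hard links cannot span volumes and need a file system that supports them, such as NTFS. The Win32 part is guarded so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Widen an ASCII-only path. Returns 0 on success, or -1 if the path has a
   non-ASCII byte or does not fit in dst (cap counts wchar_t elements). */
int widen_ascii(const char *src, wchar_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return -1;
    size_t i = 0;
    for (; src[i] != '\0'; ++i) {
        if ((unsigned char)src[i] > 0x7F || i + 1 >= cap)
            return -1;
        dst[i] = (wchar_t)src[i];   /* ASCII maps 1:1 to UTF-16 */
    }
    dst[i] = L'\0';
    return 0;
}

#ifdef _WIN32
/* Create an ASCII-named hard link to `original` (any Unicode path).
   Returns 0 on success, -1 on failure. */
int make_ascii_alias(const wchar_t *original, const char *alias)
{
    wchar_t walias[MAX_PATH];
    if (widen_ascii(alias, walias, MAX_PATH) != 0)
        return -1;
    return CreateHardLinkW(walias, original, NULL) ? 0 : -1;
}

/* Remove the alias once the library is done. The original name and the
   file's data are untouched, because only one link is deleted. */
void remove_ascii_alias(const char *alias)
{
    DeleteFileA(alias);
}
#endif
```

The intended flow is: call make_ascii_alias with the real Unicode path and a temporary ASCII name, hand that ASCII name to the library so its fopen call succeeds, then call remove_ascii_alias. If the library writes by replacing the file rather than modifying it in place, the changes would land on the alias only, so this is best suited to reading or in-place writes.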

answered Sep 29 '22 05:09 by efotinis