UTF-8 in Windows

How do I set the code page to UTF-8 in a C Windows program?

I have a third-party library that uses fopen to open files. I can use wcstombs to convert my Unicode filenames to the current code page; however, if the user has a filename containing a character outside that code page, the conversion fails and the file cannot be opened.
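To make the failure concrete, here is a minimal sketch of the conversion step. The helper name narrow_filename is hypothetical, and the sketch assumes the caller treats any conversion failure as "this filename cannot be passed to fopen". wcstombs returns (size_t)-1 when a wide character has no representation in the current locale's encoding.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: convert a wide filename to the current locale's
 * multibyte encoding. Returns 0 on success, -1 if the name contains a
 * character the encoding cannot represent or does not fit in `out`. */
static int narrow_filename(const wchar_t *wide_name, char *out, size_t out_size)
{
    size_t written = wcstombs(out, wide_name, out_size);

    if (written == (size_t)-1) {
        return -1;  /* at least one character is outside the code page */
    }
    if (written == out_size) {
        return -1;  /* buffer filled completely, no room for the terminator */
    }
    return 0;       /* `out` now holds a null-terminated narrow name */
}
```

With the default "C" locale, a plain ASCII name converts fine, while a name containing, say, CJK characters is rejected. That rejected case is the one the question is about.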

Ideally I would just call _setmbcp(65001) to set the code page to UTF-8, but the MSDN documentation for _setmbcp states that UTF-8 is not supported.

How can I get around this?

652

asked Oct 03 '08 12:10

Michael Platings

1 Answer

Unfortunately, there is no way to make Unicode the current code page in Windows. The CP_UTF7 and CP_UTF8 constants are pseudo code pages, used only by the MultiByteToWideChar and WideCharToMultiByte conversion functions, as Ben mentioned.

Your problem is similar to that of the C++ fstream classes. The fstream constructors accept only char* names, making it impossible to open a file with a true Unicode name. The only solution offered by VC was a hack: open the file separately and then attach the resulting handle to the stream object. I'm afraid this isn't an option for you, since the third-party library probably doesn't accept handles.

The only solution I can think of is to create a temporary hard link to the original file under a non-Unicode name, and pass that name to the library instead.
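A rough sketch of that workaround, under some assumptions: the helper name make_ascii_alias and the path_char typedef are hypothetical, the alias must live on the same volume as the original (hard links cannot cross volumes, and on Windows the file system must support them, as NTFS does), and the caller is responsible for deleting the alias once the library has closed the file. The Windows branch uses CreateHardLinkW. The POSIX link() branch is there only so the sketch builds on other platforms.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
typedef wchar_t path_char;  /* original file is named by a UTF-16 path */
#else
#include <unistd.h>
typedef char path_char;     /* elsewhere paths are plain byte strings */
#endif

/* Hypothetical helper: create a hard link called `ascii_alias` that points
 * at `original`. Returns 0 on success, -1 on failure. */
static int make_ascii_alias(const path_char *original, const char *ascii_alias)
{
#ifdef _WIN32
    wchar_t wide_alias[MAX_PATH];

    /* The alias is plain ASCII, so widening it with the ANSI code page
     * cannot lose information. */
    if (MultiByteToWideChar(CP_ACP, 0, ascii_alias, -1,
                            wide_alias, MAX_PATH) == 0) {
        return -1;
    }
    /* First argument is the new link, second is the existing file. */
    return CreateHardLinkW(wide_alias, original, NULL) ? 0 : -1;
#else
    /* link() takes the existing file first, then the new name. */
    return link(original, ascii_alias) == 0 ? 0 : -1;
#endif
}
```

The third-party library can then be handed ascii_alias, which fopen can open under any code page. Once the library has closed the file, remove(ascii_alias) deletes only the link and leaves the original file in place.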

101

answered Sep 29 '22 05:09

efotinis

Tags: c, windows, unicode, utf-8, winapi
