
Is there a standard way to do an fopen with a Unicode string file path?

Tags: c, unicode, fopen

Brian R. Bondy asked Dec 28 '08 19:12


2 Answers

No, there's no standard way. There are some differences between operating systems. Here's how different OSs handle non-ASCII filenames.

Linux

Under Linux, a filename is simply a byte string: any sequence of bytes is allowed except '/' and NUL. The convention on most modern distributions is to use UTF-8 for non-ASCII filenames, although it used to be common to encode filenames as ISO-8859-1. It's basically up to each application to choose an encoding, so different encodings can even coexist on the same filesystem. The locale environment variables (LC_ALL, LC_CTYPE, LANG) can give you a hint about the preferred encoding. But these days, you can probably assume UTF-8 everywhere.
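To read that hint from C rather than from the shell, POSIX offers nl_langinfo(CODESET). The following is a minimal sketch; the function name preferred_codeset is made up for illustration, and the result only says what the user's locale prefers, not how any particular filename was actually encoded.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: returns the character encoding name of the user's
 * locale, e.g. "UTF-8". Note that setlocale changes process-wide state. */
const char *preferred_codeset(void)
{
    setlocale(LC_CTYPE, "");        /* adopt the locale from the environment */
    return nl_langinfo(CODESET);    /* POSIX; not available on Windows */
}
```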

This is not without problems, though, because a filename containing an invalid UTF-8 sequence is perfectly valid on most Linux filesystems. How would you specify such a filename if you only support UTF-8? Ideally, you should support both UTF-8 and binary filenames.
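One way to support both is to keep the raw bytes for fopen and only check whether they happen to be well-formed UTF-8 when you need to display or convert the name. Here is a rough sketch of such a check; is_valid_utf8 is a name chosen for this example, not a standard function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the NUL-terminated byte string is well-formed UTF-8.
 * Rejects overlong encodings, surrogates (U+D800..U+DFFF) and
 * code points above U+10FFFF. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned char c = *p;
        size_t extra;       /* number of continuation bytes expected */
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) { p++; continue; }                 /* ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        p++;
        for (size_t i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return false;           /* also catches a premature NUL */
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

A name that fails the check (for example an old ISO-8859-1 one such as "caf\xe9.txt") can still be passed to fopen unchanged on Linux; the check only tells you not to treat it as UTF-8 text.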

OS X

The HFS+ filesystem on OS X stores filenames as Unicode (UTF-16) internally. Most C (and POSIX) library functions like fopen accept UTF-8 strings (since they're 8-bit compatible) and convert them internally.

Windows

The Windows API uses UTF-16 for filenames, but fopen interprets its path argument in the current ANSI code page, whatever that happens to be (UTF-8 has only recently become an option). Many C library functions have a non-standard equivalent that accepts UTF-16 (wchar_t on Windows), for example _wfopen instead of fopen.
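To make the difference concrete, here is a small sketch that opens the same file name both ways, selected at compile time. The function open_example and the file name are invented for the example; the only non-standard call is _wfopen in the Windows branch.

```c
#include <stdio.h>

#ifdef _WIN32
#include <wchar.h>

/* Windows: pass the path as UTF-16 (wchar_t) to the non-standard _wfopen.
 * Both the path and the mode string are wide strings. */
FILE *open_example(void)
{
    return _wfopen(L"caf\u00e9.txt", L"w");   /* \u00e9 is U+00E9 */
}
#else
/* Elsewhere: the same name spelled out as UTF-8 bytes, given to plain fopen. */
FILE *open_example(void)
{
    return fopen("caf\xc3\xa9.txt", "w");     /* C3 A9 is U+00E9 in UTF-8 */
}
#endif
```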

nwellnhof answered Sep 16 '22 14:09


On *nix, you simply use the standard fopen (see more information in the reply from TokeMacGuy, or in this forum).
On Windows, you can use _wfopen and pass it a Unicode string (for more information, see MSDN).

As there is no real common way, I would wrap this call in a macro, together with all other system-dependent functions.
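A possible shape for such a wrapper, written as a function rather than a macro so the Windows conversion has somewhere to live. This is only a sketch under the assumption that callers always pass UTF-8; fopen_utf8 and utf8_to_wide are names invented here, while MultiByteToWideChar and _wfopen are the real Windows calls.

```c
#include <stdio.h>

#ifdef _WIN32
#include <stdlib.h>
#include <windows.h>

/* Convert a UTF-8 string to a newly allocated UTF-16 string.
 * Returns NULL on invalid input or allocation failure; caller frees. */
static wchar_t *utf8_to_wide(const char *s)
{
    /* First call: ask for the required length, including the terminator. */
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
    if (n <= 0)
        return NULL;

    wchar_t *w = malloc((size_t)n * sizeof *w);
    if (w && MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                 s, -1, w, n) <= 0) {
        free(w);
        w = NULL;
    }
    return w;
}

/* Windows: convert path and mode to UTF-16 and call _wfopen. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    wchar_t *wpath = utf8_to_wide(path);
    wchar_t *wmode = utf8_to_wide(mode);
    FILE *f = (wpath && wmode) ? _wfopen(wpath, wmode) : NULL;

    free(wpath);
    free(wmode);
    return f;
}
#else
/* Everywhere else: the bytes are passed through to fopen untouched. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    return fopen(path, mode);
}
#endif
```

The rest of the program then calls fopen_utf8("r\xc3\xa9sum\xc3\xa9.txt", "rb") on every platform and never touches wchar_t itself.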

rob answered Sep 17 '22 14:09