How do I print the string which __FILE__ expands to correctly?

Consider this program:

#include <stdio.h>
int main() {
    printf("%s\n", __FILE__);
    return 0;
}

Depending on the name of the file, this program either works or it doesn't. I'd like to print the name of the current file in an encoding-safe way. However, if the file name contains characters that cannot be represented in the current code page, the compiler (rightfully) emits a warning:

?????????.c(3) : warning C4566: character represented by universal-character-name '\u043F' cannot be represented in the current code page (1252)

How do I tackle this? I'd like to store the string given by __FILE__ in e.g. UTF-16 so that I can properly print it on any other system at runtime (by converting the stored UTF-16 representation to whatever the runtime system uses). To do so, I need to know:

  1. What encoding is used for the string given by __FILE__? It seems that, at least on Windows, the current system code page (in my case, Windows-1252) is used - but this is just guessing. Is this true?
  2. How can I store the UTF-8 (or UTF-16) representation of that string in my source code at build time?

My real life use case: I have a macro which traces the current program execution, writing the current sourcecode/line number information to a file. It looks like this:

struct LogFile {
    // Write message to file. The file should contain the UTF-8 encoded data!
    void writeMessage( const std::string &msg );
};

// Global function which returns a pointer to the 'active' log file.
LogFile *activeLogFile();

#define TRACE_BEACON activeLogFile()->writeMessage( __FILE__ );

This breaks in case the current source file has a name which contains characters which cannot be represented by the current code page.

Frerich Raabe asked Jul 20 '10 14:07

2 Answers

You can use the token-pasting operator to turn __FILE__ into a wide string literal, like this:

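#include <stdio.h>
#include <wchar.h>

/* Two macro levels so that __FILE__ is expanded before L is pasted onto it. */
#define WIDEN2(x) L ## x
#define WIDEN(x) WIDEN2(x)
#define WFILE WIDEN(__FILE__)

int main() {
    /* wprintf takes a wide format string; %ls prints a wide string. */
    wprintf(L"%ls\n", WFILE);
    return 0;
}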
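
Since the question asks for UTF-8 in the log file, a wide literal still has to be converted at runtime. The following is a minimal sketch of such a conversion, not a definitive implementation. The name `wide_to_utf8` is hypothetical, and the sketch assumes that `wchar_t` holds either UTF-16 code units (as on Windows) or full code points (as on most Unix-like systems), and that the compiler decoded the file name correctly when it built the wide literal.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: encode the wide string `src` as UTF-8 into `dst`.
 * Returns the number of bytes written (not counting the terminating NUL),
 * or (size_t)-1 if `dst` is too small or `src` is not valid UTF-16/UTF-32. */
size_t wide_to_utf8(const wchar_t *src, char *dst, size_t dst_size)
{
    unsigned char *out = (unsigned char *)dst;
    size_t pos = 0;

    while (*src != L'\0') {
        uint32_t cp = (uint32_t)*src++;

        /* With a 16-bit wchar_t, characters above U+FFFF arrive as a
         * surrogate pair that has to be combined into one code point. */
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            uint32_t low = (uint32_t)*src;
            if (low < 0xDC00 || low > 0xDFFF)
                return (size_t)-1;              /* unpaired high surrogate */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++src;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                  /* unpaired low surrogate */
        }
        if (cp > 0x10FFFF)
            return (size_t)-1;                  /* outside the Unicode range */

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (pos + len >= dst_size)
            return (size_t)-1;                  /* no room for bytes plus NUL */

        switch (len) {
        case 1:
            out[pos++] = (unsigned char)cp;
            break;
        case 2:
            out[pos++] = (unsigned char)(0xC0 | (cp >> 6));
            out[pos++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            out[pos++] = (unsigned char)(0xE0 | (cp >> 12));
            out[pos++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[pos++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        default:
            out[pos++] = (unsigned char)(0xF0 | (cp >> 18));
            out[pos++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[pos++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[pos++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        }
    }

    if (pos >= dst_size)
        return (size_t)-1;                      /* no room for the NUL */
    out[pos] = '\0';
    return pos;
}
```

With the macros above, a call such as `wide_to_utf8(WFILE, buf, sizeof buf)` would fill `buf` with UTF-8 bytes that can be handed to the log file. Whether the wide literal itself is correct still depends on how the compiler interprets the source file name.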
Hans Passant answered Oct 06 '22 08:10


__FILE__ always expands to a character string literal, so in essence it is compatible with char const*. This means that a compiler implementation has little choice but to use the raw byte representation of the source file name as it presents itself at compile time.

Whether or not this is sensible in the current locale doesn't matter: you could have a source file name that is basically garbage, as long as your runtime system and compiler accept it as a valid file name.

If you, as a user, have a locale whose encoding differs from the one used in your file system, you will see a lot of ???? or the like.

But if both your locales agree upon the encoding, a plain printf should suffice and your terminal (or whatever you use to look at the output) should be able to print the characters correctly.

So the short answer is that it will only work if your system is consistent with respect to encoding. Otherwise you're out of luck, since guessing encodings is quite a difficult task.
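
One way to see which bytes the compiler actually stored for __FILE__, instead of guessing the encoding, is to dump them in hexadecimal. This is only a sketch for inspection purposes, and the helper name `hex_bytes` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of `s` into `out` as space-separated
 * two-digit hex values. Returns `out`, or NULL if `out` is too small. */
char *hex_bytes(const char *s, char *out, size_t out_size)
{
    size_t pos = 0;

    if (out_size == 0)
        return NULL;
    out[0] = '\0';

    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        /* Each byte needs at most 3 characters (" XX") plus the NUL. */
        if (pos + 4 > out_size)
            return NULL;
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                pos ? " %02X" : "%02X", (unsigned)*p);
    }
    return out;
}
```

Calling `hex_bytes(__FILE__, buf, sizeof buf)` and printing `buf` shows the raw bytes. For example, a Cyrillic "п" (U+043F) would appear as `D0 BF` if the name was stored as UTF-8, and as a single `3F` (a question mark) if the compiler could not represent it in the current code page.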

Jens Gustedt answered Oct 06 '22 09:10