For the code below in C:
char s[] = "这个问题";
printf("%s", s);
I know from the file command that the source file is "UTF-8 Unicode C program text".
How is the string encoded after compilation? Is it also UTF-8 in the .out file?
When the binary is executed in bash, how is the string encoded in memory? Is it also UTF-8?
Then, how does bash know the encoding scheme and show the right characters?
Last, now that bash knows what to show, how are the bytes translated to pixels on the screen? Is there some mapping from bytes to pixels?
In all these processes, is there any encoding or decoding of UTF-8?
UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character. This is the meaning of “UTF”, or “Unicode Transformation Format.”
Most C string library routines still work with UTF-8, since they only scan for the terminating NUL byte, and UTF-8 never places a zero byte inside a multi-byte sequence.
UTF-8 actually works quite well in std::string. Most operations work out of the box because the UTF-8 encoding is self-synchronizing and backward compatible with ASCII.
Assuming GCC, its manual says that the preprocessor first translates the incoming files to the so-called source character set, which for GCC is UTF-8. So for a UTF-8 file, nothing happens. The execution character set is then used for string constants, and that is (again, for GCC) UTF-8 by default.
So your UTF-8 string "survives" and exists in the executable as a bunch of bytes in UTF-8 encoding.
The terminal also has a character set, and it has to match: the C program does nothing further to translate strings when they are printed; they are written out byte for byte. If the terminal isn't set to UTF-8, you will just get garbage.
As I noted in a comment, bash has nothing to do with this.