What causes a char to be signed or unsigned when using gcc?

What determines whether a char in C (using gcc) is signed or unsigned? I know that the standard doesn't dictate one over the other and that I can check CHAR_MIN and CHAR_MAX from limits.h, but I want to know what triggers one over the other when using gcc.

If I read limits.h from libgcc-6, I see that there is a macro __CHAR_UNSIGNED__ which makes a "default" char signed or unsigned, but I'm unsure whether this is set by the compiler at its own build time.

I tried to list GCC's predefined macros with

$ gcc -dM -E -x c /dev/null | grep -i CHAR
#define __UINT_LEAST8_TYPE__ unsigned char
#define __CHAR_BIT__ 8
#define __WCHAR_MAX__ 0x7fffffff
#define __GCC_ATOMIC_CHAR_LOCK_FREE 2
#define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
#define __SCHAR_MAX__ 0x7f
#define __WCHAR_MIN__ (-__WCHAR_MAX__ - 1)
#define __UINT8_TYPE__ unsigned char
#define __INT8_TYPE__ signed char
#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
#define __CHAR16_TYPE__ short unsigned int
#define __INT_LEAST8_TYPE__ signed char
#define __WCHAR_TYPE__ int
#define __GCC_ATOMIC_CHAR16_T_LOCK_FREE 2
#define __SIZEOF_WCHAR_T__ 4
#define __INT_FAST8_TYPE__ signed char
#define __CHAR32_TYPE__ unsigned int
#define __UINT_FAST8_TYPE__ unsigned char

but wasn't able to find __CHAR_UNSIGNED__

Background: I've some code which I compile on two different machines:

Desktop PC:

Debian GNU/Linux 9.1 (stretch)
gcc version 6.3.0 20170516 (Debian 6.3.0-18)
Intel(R) Core(TM) i3-4150
libgcc-6-dev: 6.3.0-18
char is signed

Raspberry Pi3:

Raspbian GNU/Linux 9.1 (stretch)
gcc version 6.3.0 20170516 (Raspbian 6.3.0-18+rpi1)
ARMv7 Processor rev 4 (v7l)
libgcc-6-dev: 6.3.0-18+rpi
char is unsigned

So the only obvious difference is the CPU architecture...

asked Sep 28 '17 07:09

Andy

1 Answer

According to the C11 standard (read n1570), char can be signed or unsigned (so you actually have two flavors of C). Which one it is is implementation-defined.

Some processors and instruction set architectures or application binary interfaces favor a signed character (byte) type (e.g. because it maps nicely to some machine code instruction); others favor an unsigned one.

gcc even has the -fsigned-char and -funsigned-char options, which you should almost never use (because changing the default breaks some corner cases in calling conventions and ABIs) unless you recompile everything, including your C standard library.
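To make concrete what these two options change, here is a minimal sketch. The helper names widen_plain and widen_as_byte are hypothetical (they are not part of gcc or of any library), and the sketch assumes 8-bit chars. It shows that widening a plain char with the high bit set gives a different int depending on the signedness of char, while going through unsigned char gives the same value everywhere.

```c
#include <assert.h>
#include <limits.h>

/* Widen a plain char to int. For bytes above 0x7f the result depends on
   whether plain char is signed, which is exactly what -fsigned-char and
   -funsigned-char change. */
static int widen_plain(char c)
{
    return c;
}

/* Widen through unsigned char: the result is always in 0..UCHAR_MAX,
   whatever the target default or the flag in use. */
static int widen_as_byte(char c)
{
    int value = (unsigned char)c;
    assert(value >= 0 && value <= UCHAR_MAX);
    return value;
}
```

With a signed char (as on the asker's x86 desktop) widen_plain('\xE9') would be -23, with an unsigned char (as on the Raspberry Pi) it would be 233, whereas widen_as_byte('\xE9') is 233 in both cases. Code that always converts through unsigned char before widening does not care which flavor it is compiled with.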

You could use feature_test_macros(7) and <endian.h> (see endian(3)) or autoconf on Linux to detect what your system has.

In most cases, you should write portable C code, which does not depend upon those things. And you can find cross-platform libraries (e.g. glib) to help you with that.

BTW, gcc -dM -E -x c /dev/null also gives __BYTE_ORDER__ etc., and if you want an unsigned 8-bit byte you should use <stdint.h> and its uint8_t (more portable and more readable). The standard limits.h defines CHAR_MIN, SCHAR_MIN, CHAR_MAX and SCHAR_MAX (you could compare them for equality to detect implementations where char is signed), etc.
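The limits.h comparison can be sketched as follows. The function names are hypothetical, chosen only for illustration. The second function relies on documented GCC behavior: GCC predefines __CHAR_UNSIGNED__ only when plain char is unsigned on the target, which would explain why the macro does not show up in the -dM output on a machine where char is signed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Portable check: plain char is signed exactly when CHAR_MIN is negative
   (equivalently, when CHAR_MIN == SCHAR_MIN). */
static bool char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* GCC-specific check: __CHAR_UNSIGNED__ is predefined only when plain char
   is unsigned, so the macro is simply absent on signed-char targets. */
static bool char_is_signed_per_gcc_macro(void)
{
#ifdef __CHAR_UNSIGNED__
    return false;
#else
    return true;
#endif
}

/* Hypothetical self-check: under GCC both methods should agree. */
static void check_char_signedness(void)
{
    assert(char_is_signed() == ((char)-1 < 0));
    assert(char_is_signed() == char_is_signed_per_gcc_macro());
}
```

Calling check_char_signedness() from main is enough to confirm that the two views agree on a given toolchain. In portable code the limits.h version is the one to prefer, since the macro is compiler-specific.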

BTW, you should care about character encoding, but most systems today use UTF-8 everywhere. Libraries like libunistring are helpful. See also this and remember that, practically speaking, a Unicode character encoded in UTF-8 can span several bytes (i.e. several chars).
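As a small illustration of a character spanning several chars, the sketch below counts code points in a UTF-8 string by skipping continuation bytes (those of the form 10xxxxxx). The function name utf8_codepoints is hypothetical, and the sketch assumes the input is valid UTF-8; it does no validation, which a real library such as libunistring would handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a NUL-terminated string assumed to be valid UTF-8.
   Every byte that is not a continuation byte (10xxxxxx) starts a new
   code point. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Convert to unsigned char first so the bit test behaves the same
           whether plain char is signed or unsigned. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For example, the string "\xC3\xA9t\xC3\xA9" (the UTF-8 encoding of "été") occupies 5 chars according to strlen but holds 3 code points according to utf8_codepoints.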

answered Sep 17 '22 15:09

Basile Starynkevitch
