How does C know what type to expect?

If all values are nothing more than one or more bytes, and no byte can contain metadata, how does the system keep track of what sort of number a byte represents? Looking into two's complement and single-precision floating point on Wikipedia reveals how these numbers can be represented in base two, but I'm still left wondering how the compiler or processor (I'm not sure which I'm really dealing with here) determines that this byte must be a signed integer.

It is analogous to receiving an encrypted letter and, looking at my shelf of cyphers, wondering which one to grab. Some indicator is necessary.

If I think about what I might do to solve this problem, two solutions come to mind. Either I would claim an additional byte and use it to store a description, or I would allocate sections of memory specifically for numerical representations: a section for signed numbers, a section for floats, etc.

I'm dealing primarily with C on a Unix system but this may be a more general question.

asked Mar 01 '13 17:03

Jack Stout

4 Answers

how does the system keep track of what sort of number a byte represents?

"The system" doesn't. During translation, the compiler knows the types of the objects it's dealing with, and generates the appropriate machine instructions for dealing with those values.

answered Nov 12 '22 14:11

John Bode

Ooh, good question. Let's start with the CPU - assuming an Intel x86 chip.

It turns out the CPU does not know whether a byte is "signed" or "unsigned." Instead, when you add two numbers (or perform most other arithmetic operations), flags in a "status register" are set.

Take a look at the "sign flag." When you add two numbers, the CPU does just that: it adds the numbers and stores the result in a register. But the CPU also asks "if instead we interpreted these numbers as two's complement signed integers, is the result negative?" If so, the "sign flag" is set to 1.

So if you were writing in assembly and your program cared about signed versus unsigned, you would check the status of that flag, and the rest of your program would perform a different task based on it.

So when you use signed int versus unsigned int in C, you are basically telling the compiler how (or whether) to use that sign flag.

answered Nov 12 '22 13:11

poundifdef

The code that is executed has no information about the types. The only tool that knows the types is the compiler at the time it compiles the code. Types in C are solely a restriction at compile time to prevent you from using the wrong type somewhere. While compiling, the C compiler keeps track of the type of each variable and therefore knows which type belongs to which variable.

This is why you need format strings in printf, for example. printf has no way of knowing what types it will get in its argument list, because this information is lost. Languages like Go or Java have a runtime with reflection capabilities, which makes it possible to get the type.

If your compiled C code still carried type information, the resulting assembly would have to check types at run time. It turns out that the closest thing to types in assembly is the size of an instruction's operands, which in GAS syntax is indicated by a suffix (for example, movb versus movl). So what is left of your type information is essentially the operand size and little else.

One example of an assembly-like language that does carry types is Java VM bytecode, whose instructions for primitives encode the operand type in the opcode name (for example, iadd for int and dadd for double).

answered Nov 12 '22 14:11

nemo

It is important to remember that C and C++ are high-level languages. The compiler's job is to take the plain-text representation of the code and build it into the platform-specific instructions that the target platform expects to execute. For most people using PCs, this tends to be x86 machine code.

This is why C and C++ are so loose with how they define the basic data types. For example, most people say there are 8 bits in a byte. The standard does not fix that number: it only requires a byte to have at least 8 bits (CHAR_BIT), and nothing prevents some machine out there from having 9 or 16 bits per byte as its native interpretation of data. Beyond that, the standard only recognizes that a byte is the smallest addressable unit of data.

So the interpretation of data is up to the instruction set of the processor. In many modern languages there is another abstraction on top of this, the Virtual Machine.

If you write your own scripting language it is up to you to define how you interpret your data in software.

answered Nov 12 '22 13:11

Matthew Sanders

Tags:

c

memory

twos-complement

single-precision