Derived Data Types in C. While there are five primary data types, C also has various derived data types that help in storing complex kinds of data.
Main types. The C language provides the four basic arithmetic type specifiers char, int, float and double, and the modifiers signed, unsigned, short, and long. The following table lists the permissible combinations in specifying a large set of storage size-specific declarations.
C provides various types of data-types which allow the programmer to select the appropriate type for the variable to set its value. The data-type in a programming language is the collection of data with values having fixed meaning as well as characteristics. Some of them are an integer, floating point, character, etc.
Data types in C refer to an extensive system used for declaring variables or functions of different types. The type of a variable determines how much space it occupies in storage and how the bit pattern stored is interpreted.
Yes, there are data types not directly supported.
On many embedded systems, there is no hardware floating point unit. So, when you write code like this:
float x = 1.0f, y = 2.0f;
return x + y;
It gets translated into something like this:
unsigned x = 0x3f800000, y = 0x40000000;
return _float_add(x, y);
Then the compiler or standard library has to supply an implementation of _float_add(), which takes up memory on your embedded system. If you're counting bytes on a really tiny system, this can add up.
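To see where constants like 0x3f800000 come from, here is a minimal sketch that reads back the bit pattern of a float. The name float_bits is made up for illustration, and the code assumes that float is a 32-bit IEEE 754 single, which is common but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
   Assumption: float is a 32-bit IEEE 754 single-precision value. */
uint32_t float_bits(float f)
{
    uint32_t bits;

    assert(sizeof bits == sizeof f);  /* check the 32-bit assumption */
    memcpy(&bits, &f, sizeof bits);   /* copy the bytes; no conversion happens */
    return bits;
}
```

Under that assumption, float_bits(1.0f) and float_bits(2.0f) give the two constants used in the translated code above.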
Another common example is 64-bit integers (long long in the C standard since 1999), which are not directly supported by 32-bit systems. Old SPARC systems didn't support integer multiplication, so multiplication had to be supplied by the runtime. There are other examples.
By comparison, other languages have more complicated primitives.
For example, a Lisp symbol requires a lot of runtime support, just like tables in Lua, strings in Python, arrays in Fortran, et cetera. The equivalent types in C are usually either not part of the standard library at all (no standard symbols or tables) or they are much simpler and don't require much runtime support (arrays in C are basically just pointers, nul-terminated strings are almost as simple).
A notable control structure missing from C is exception handling. Nonlocal exit is limited to setjmp() and longjmp(), which just save and restore certain parts of processor state. By comparison, the C++ runtime has to walk the stack and call destructors and exception handlers.
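For anyone who hasn't used them, here is a small sketch of a nonlocal exit with setjmp() and longjmp(). The function names safe_divide and divide_or_bail are invented for this example; only setjmp, longjmp and jmp_buf come from the standard library.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf on_error;  /* saved state that longjmp() returns to */

/* Hypothetical helper: bails out non-locally on a zero divisor. */
static int divide_or_bail(int a, int b)
{
    if (b == 0)
        longjmp(on_error, 1);  /* setjmp() below "returns" again, now with 1 */
    return a / b;
}

/* Returns 0 and stores the quotient on success, -1 if the helper bailed out. */
int safe_divide(int a, int b, int *out)
{
    assert(out != NULL);
    if (setjmp(on_error) != 0)
        return -1;             /* reached only through longjmp() */
    *out = divide_or_bail(a, b);
    return 0;
}
```

Note that nothing is cleaned up on the way out. If divide_or_bail had allocated memory before jumping, it would simply leak, which is exactly the bookkeeping a C++ runtime does for you.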
Actually, I'll bet that the contents of this introduction haven't changed much since 1978 when Kernighan and Ritchie first wrote them in the First Edition of the book, and they refer to the history and evolution of C at that time more than modern implementations.
Computers are fundamentally just memory banks and central processors, and each processor operates using machine code. Part of the design of each processor is an instruction set architecture, and its assembly language maps a set of human-readable mnemonics (almost) one-to-one to machine code, which is all numbers.
The authors of the C language – and the B and BCPL languages that immediately preceded it – were intent upon defining constructs in the language that were as efficiently compiled into Assembly as possible ... in fact, they were forced to by limitations in the target hardware. As other answers have pointed out, this involved branches (GOTO and other flow control in C), moves (assignment), logical operations (& | ^), basic arithmetic (add, subtract, increment, decrement), and memory addressing (pointers). A good example is the pre-/post-increment and decrement operators in C, which supposedly were added to the B language by Ken Thompson specifically because they were capable of translating directly to a single opcode once compiled.
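As an illustration of how those operators are typically used, here is the classic string-copy loop. This is only a sketch: the name copy_string is mine, and whether each `*dst++ = *src++` really becomes one instruction depends on the target having an auto-increment addressing mode and on the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a nul-terminated string from src to dst and return dst.
   The caller must make sure dst is large enough. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;

    assert(dst != NULL && src != NULL);
    /* Copy one character, advance both pointers, stop after the nul. */
    while ((*dst++ = *src++) != '\0')
        ;
    return start;
}
```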
This is what the authors meant when they said "supported directly by most computers". They didn't mean that other languages contained types and structures that were not supported directly - they meant that by design C constructs translated most directly (sometimes literally directly) into Assembly.
This close relation to the underlying Assembly, while still providing all the elements required for structured programming, is what led to C's early adoption, and what keeps it a popular language today in environments where the efficiency of the compiled code is still key.
For an interesting write-up of the history of the language, see The Development of the C Language - Dennis Ritchie
The short answer is that most of the language constructs supported by C are also supported by the target computer's microprocessor. Compiled C code therefore translates very nicely and efficiently to the microprocessor's assembly language, resulting in smaller code and a smaller footprint.
The longer answer requires a little bit of assembly language knowledge. In C, a statement such as this:
int myInt = 10;
would translate to something like this in assembly:
myInt dw 1
mov myInt,10
Compare this to something like C++:
MyClass myClass;
myClass.set_myInt(10);
The resulting assembly language code (depending on how big MyClass is) could add up to hundreds of lines.
Without actually creating programs in assembly language, pure C is probably the "skinniest" and "tightest" code you can make a program in.
EDIT
Given the comments on my answer, I decided to run a test, just for my own sanity. I created a program called "test.c", which looked like this:
#include <stdio.h>
void main()
{
int myInt=10;
printf("%d\n", myInt);
}
I compiled this down to assembly using gcc. I used the following command line to compile it:
gcc -S -O2 test.c
Here is the resulting assembly language:
.file "test.c"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "%d\n"
.section .text.unlikely,"ax",@progbits
.LCOLDB1:
.section .text.startup,"ax",@progbits
.LHOTB1:
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB24:
.cfi_startproc
movl $10, %edx
movl $.LC0, %esi
movl $1, %edi
xorl %eax, %eax
jmp __printf_chk
.cfi_endproc
.LFE24:
.size main, .-main
.section .text.unlikely
.LCOLDE1:
.section .text.startup
.LHOTE1:
.ident "GCC: (Ubuntu 4.9.1-16ubuntu6) 4.9.1"
.section .note.GNU-stack,"",@progbits
I then created a file called "test.cpp", which defined a class and printed the same thing as "test.c":
#include <iostream>
using namespace std;
class MyClass {
int myVar;
public:
void set_myVar(int);
int get_myVar(void);
};
void MyClass::set_myVar(int val)
{
myVar = val;
}
int MyClass::get_myVar(void)
{
return myVar;
}
int main()
{
MyClass myClass;
myClass.set_myVar(10);
cout << myClass.get_myVar() << endl;
return 0;
}
I compiled it the same way, using this command:
g++ -O2 -S test.cpp
Here is the resulting assembly file:
.file "test.cpp"
.section .text.unlikely,"ax",@progbits
.align 2
.LCOLDB0:
.text
.LHOTB0:
.align 2
.p2align 4,,15
.globl _ZN7MyClass9set_myVarEi
.type _ZN7MyClass9set_myVarEi, @function
_ZN7MyClass9set_myVarEi:
.LFB1047:
.cfi_startproc
movl %esi, (%rdi)
ret
.cfi_endproc
.LFE1047:
.size _ZN7MyClass9set_myVarEi, .-_ZN7MyClass9set_myVarEi
.section .text.unlikely
.LCOLDE0:
.text
.LHOTE0:
.section .text.unlikely
.align 2
.LCOLDB1:
.text
.LHOTB1:
.align 2
.p2align 4,,15
.globl _ZN7MyClass9get_myVarEv
.type _ZN7MyClass9get_myVarEv, @function
_ZN7MyClass9get_myVarEv:
.LFB1048:
.cfi_startproc
movl (%rdi), %eax
ret
.cfi_endproc
.LFE1048:
.size _ZN7MyClass9get_myVarEv, .-_ZN7MyClass9get_myVarEv
.section .text.unlikely
.LCOLDE1:
.text
.LHOTE1:
.section .text.unlikely
.LCOLDB2:
.section .text.startup,"ax",@progbits
.LHOTB2:
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB1049:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $10, %esi
movl $_ZSt4cout, %edi
call _ZNSolsEi
movq %rax, %rdi
call _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE1049:
.size main, .-main
.section .text.unlikely
.LCOLDE2:
.section .text.startup
.LHOTE2:
.section .text.unlikely
.LCOLDB3:
.section .text.startup
.LHOTB3:
.p2align 4,,15
.type _GLOBAL__sub_I__ZN7MyClass9set_myVarEi, @function
_GLOBAL__sub_I__ZN7MyClass9set_myVarEi:
.LFB1056:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $_ZStL8__ioinit, %edi
call _ZNSt8ios_base4InitC1Ev
movl $__dso_handle, %edx
movl $_ZStL8__ioinit, %esi
movl $_ZNSt8ios_base4InitD1Ev, %edi
addq $8, %rsp
.cfi_def_cfa_offset 8
jmp __cxa_atexit
.cfi_endproc
.LFE1056:
.size _GLOBAL__sub_I__ZN7MyClass9set_myVarEi, .-_GLOBAL__sub_I__ZN7MyClass9set_myVarEi
.section .text.unlikely
.LCOLDE3:
.section .text.startup
.LHOTE3:
.section .init_array,"aw"
.align 8
.quad _GLOBAL__sub_I__ZN7MyClass9set_myVarEi
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.hidden __dso_handle
.ident "GCC: (Ubuntu 4.9.1-16ubuntu6) 4.9.1"
.section .note.GNU-stack,"",@progbits
As you can clearly see, the resulting assembly file is much larger for the C++ file than it is for the C file. Even if you cut out all the other stuff and just compare the C "main" to the C++ "main", there is a lot of extra stuff.
K&R mean that most C expressions (technical meaning) map to one or a few assembly instructions, not a function call to a support library. The usual exceptions are integer division on architectures without a hardware div instruction, or floating point on machines with no FPU.
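As a rough sketch of the kind of helper a runtime might supply for unsigned division on a CPU with no divide instruction, here is a bit-by-bit (restoring) divider. The name soft_udiv is hypothetical; real runtime libraries use their own names and usually faster algorithms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software division: returns n / d and, if rem is not NULL,
   stores n % d. Uses only shifts, compares and subtraction. */
uint32_t soft_udiv(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint32_t r = 0;

    assert(d != 0);  /* division by zero is undefined in C as well */
    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);  /* bring down the next bit of n */
        if (r >= d) {                    /* divisor fits: subtract, set quotient bit */
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }
    if (rem != NULL)
        *rem = r;
    return q;
}
```

So a plain `a / b` in the source can turn into a call to a 32-iteration loop like this one, which is the exception K&R's statement glosses over.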
There's a quote:
C combines the flexibility and power of assembly language with the user-friendliness of assembly language.
(found here. I thought I remembered a different variation, like "speed of assembly language with the convenience and expressivity of assembly language".)
Some higher level languages define the exact width of their data types, and implementations on all machines must work the same. Not C, though.
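A quick way to see this is to ask the implementation how wide its types are. In the sketch below, storage_bits is a made-up helper; CHAR_BIT and sizeof are standard. The standard only guarantees minimum widths (for example, int is at least 16 bits), so the actual numbers differ between platforms.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bits of storage for an object of the
   given size, e.g. storage_bits(sizeof(int)). */
size_t storage_bits(size_t size_in_bytes)
{
    assert(size_in_bytes > 0);
    return size_in_bytes * CHAR_BIT;  /* CHAR_BIT is the number of bits in a char */
}
```

When you do need an exact width, C99 added types such as int32_t in <stdint.h>, but an implementation only provides them if the hardware has a matching type.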
If you want to work with 128-bit ints on x86-64, or in the general case a BigInteger of arbitrary size, you need a library of functions for it. All CPUs now use two's complement as the binary representation of negative integers, but even that wasn't the case back when C was designed. (That's why some things that would give different results on non-two's-complement machines are technically undefined in the C standards.)
If you want ref-counted references, you have to do it yourself. If you want C++ virtual member functions that call a different function depending on what kind of object your pointer is pointing to, the C++ compiler has to generate a lot more than just a call instruction with a fixed address.
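To make the "do it yourself" part concrete, here is a minimal sketch of virtual-style dispatch written by hand in C, using a table of function pointers. All the names (shape, shape_vtable, RECT_VT, TRI_VT, shape_area) are invented for this example, and a real C++ compiler's layout is more involved than this.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* Hand-written "vtable": one function pointer per virtual operation. */
struct shape_vtable {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;  /* which implementation this object uses */
    int w, h;
};

static int rect_area(const struct shape *s) { return s->w * s->h; }
static int tri_area(const struct shape *s)  { return s->w * s->h / 2; }

const struct shape_vtable RECT_VT = { rect_area };
const struct shape_vtable TRI_VT  = { tri_area };

/* The "virtual call": load the table, load the pointer, call indirectly. */
int shape_area(const struct shape *s)
{
    assert(s != NULL && s->vt != NULL);
    return s->vt->area(s);
}
```

The call in shape_area needs two loads and an indirect call instead of a call to a fixed address, and in C every one of those steps is spelled out in the source.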
Outside of library functions, the only string operations provided are read/write a character. No concat, no substring, no search. (Strings are stored as nul-terminated ('\0') arrays of 8-bit integers, not pointer+length, so to get a substring you'd have to write a nul into the original string.)
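If you don't want to modify the original string, the usual alternative is to copy the characters out into a separate buffer. Here is a sketch of that; the function name substring and its parameter order are my own invention, not anything from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy up to len characters of src, starting at
   index start, into dst (which holds dst_size bytes). The result is
   always nul-terminated. Returns the number of characters copied. */
size_t substring(char *dst, size_t dst_size,
                 const char *src, size_t start, size_t len)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t src_len = strlen(src);        /* has to scan for the nul: O(n) */
    if (start > src_len)
        start = src_len;                 /* clamp to the end of the string */
    if (len > src_len - start)
        len = src_len - start;           /* don't read past the nul */
    if (len > dst_size - 1)
        len = dst_size - 1;              /* leave room for the terminator */

    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return len;
}
```

Even this small operation needs a length scan, bounds checks and a copy, none of which the language does for you.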
CPUs sometimes have instructions designed for use by a string-search function, but still usually process one byte per instruction executed, in a loop. (or with the x86 rep prefix. Maybe if C was designed on x86, string search or compare would be a native operation, rather than a library function call.)
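The byte-at-a-time loop looks roughly like this in plain C. This is just a sketch with a made-up name (find_char); the standard library's strchr does a similar job, and real implementations are often hand-tuned for the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical search: return a pointer to the first occurrence of c
   in the nul-terminated string s, or NULL if it is not there. */
const char *find_char(const char *s, char c)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {  /* one load and one compare per byte */
        if (*s == c)
            return s;
    }
    return NULL;
}
```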
Many other answers give examples of things that aren't natively supported, like exception handling, hash tables, lists. K&R's design philosophy is the reason C doesn't have any of these natively.
The assembly language of a processor generally deals with jump (go to) statements, move statements, binary arithmetic and logic (XOR, NAND, AND, OR, etc.), and memory fields (or addresses). It categorizes memory into two types, instruction and data. That is about all an assembly language is (I am sure assembly programmers will argue there is more to it than that, but it boils down to this in general). C closely resembles this simplicity.
C is to assembly what algebra is to arithmetic.
"C encapsulates the basics of assembly (the processor's language)" is probably a truer statement than "Because the data types and control structures provided by C are supported directly by most computers".