 

Null byte and arrays in C

Tags: arrays, c

If I declare a char array of say 10 chars like so...

char letters[10];

am I creating a set of memory locations that are represented as chars from index 0-9 then the 10th index is the null byte?

If so, does that mean I'm really creating 11 locations in memory for the array (0 to 10), with the last element being the null byte? Or do I have 10 locations in memory (0 to 9), and C then adds the null byte at a new position, so that the array is 1 byte longer than I declared?

Thanks

asked Nov 23 '13 11:11 by CS Student


3 Answers

It seems you are confusing arrays with strings.
When you declare

char letters[10] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};  

then it reserves exactly 10 contiguous bytes of memory, with no room for a terminator.

  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  // memory addresses, assumed to start at 2000 for simplicity
 +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
 |     |     |     |     |     |     |     |     |     |     |
 | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' |
 |     |     |     |     |     |     |     |     |     |     |
 +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

In C, indexing starts at 0. You can access the allocated memory from letters[0] to letters[9]. Accessing letters[10] invokes undefined behavior. But when you declare it like this

char *letters = "0123456789";  

or

char letters[11] = "0123456789"; 

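then 11 bytes are allocated in memory: 10 for the characters 0123456789 and one for \0 (the NUL character). In the pointer form, those 11 bytes belong to the string literal, which must not be modified, and the pointer itself occupies separate storage.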

 2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010 // memory addresses, assumed to start at 2000 for simplicity
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------+
|     |     |     |     |     |     |     |     |     |     |       |
| '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | '\0'  |
|     |     |     |     |     |     |     |     |     |     |       |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------+  
                                                                ^
                                                                | NUL character   

Take another example:

#include <stdio.h>

int main(void) {
    char arr[11];

    /* %10s reads at most 10 characters, leaving room for the '\0' */
    if (scanf("%10s", arr) == 1)
        printf("%s", arr);

    return 0;
}

Input:

asdf  

Output:

asdf

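Now have a look at the memory layout. scanf stores the four characters and appends '\0' after them; the remaining bytes are left uninitialized.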

 +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------+
 |     |     |     |     |     |     |     |     |     |     |       |
 | 'a' | 's' | 'd' | 'f' |'\0' |     |     |     |     |     |       |
 |     |     |     |     |     |     |     |     |     |     |       |
 +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------+  
answered Oct 13 '22 16:10 by haccks


am I creating a set of memory locations that are represented as chars from index 0-9

Yes

then the 10th index is the null byte?

No.

You reserved space for exactly 10 chars. Nothing else. Nothing will automatically set the last byte to zero, or act as if it were. There is no 11th char that could hold a zero; you only have 10.

If you're going to use that with string functions, it's your duty as the programmer to make sure that your string is null-terminated. (And here that means it can hold at most 9 significant characters.)

Some common examples with initialization:

// 10 chars exactly, not initialized - you have to take care of everything
char arr1[10];
// 10 chars exactly, all initialized - last 7 to zero - ok "C string"
char arr2[10] = { 'a', 'b', 'c' };
// three chars exactly, initialized to a, b and c - not a "C string"
char arr3[] = { 'a', 'b', 'c' };
// four chars exactly, initialized to a, b, c and zero - ok "C string"
char arr4[] = "abc";
answered Oct 13 '22 16:10 by Mat


While programming in Turbo C++, try using F7, or F8 and Alt+F4, to see what is happening inside your program. That is very useful for a beginner who has doubts like this.

Whenever you declare a variable, a separate memory location is allotted to it. In the case of an array variable like

char letters[10];

ten memory locations, one per char, are allotted to the letters variable.

The size of each allocation varies with the data type (int, char, float, ...).

Again, in your case: if you want to store a name like "csstudent" in the array, you have to declare an array of size ten even though "csstudent" has only nine characters, because the last index stores the '\0' character that marks the end of the string.

// 1000, 1001, ... are assumed memory addresses; they will vary on your system

   1000  1001  1002  1003  1004  1005  1006  1007  1008  1009  
 +-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
 |     |     |     |     |     |     |     |     |     |      |
 | 'c' | 's' | 's' | 't' | 'u' | 'd' | 'e' | 'n' | 't' | '\0' |
 |     |     |     |     |     |     |     |     |     |      |
 +-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
answered Oct 13 '22 17:10 by Suganthan Madhavan Pillai