 

Two dimensional arrays with pointers

Tags:

c

Can someone explain the output of the program below?

#include<stdio.h>
void main()
{
char a[5][10]={"one","two","three","four","five"};
char **str=a;
printf("%p ", &a[0]);
printf("\n%p ", &str[0]);
printf("\n%p ", &str[3]);
printf("\n%p ", &str[1][56]);
printf("\n%p ", &(*(*(str+4)+1)));
}

Below is the output observed:

0xbf7f6286 
0xbf7f6286 
0xbf7f6292 
0x38 
0x1 
  • &a[0] is the starting address of the array
  • &str[0] is the same

    Can someone explain how the address of str[3] is 0xbf7f6292? As per my understanding, &str[0] and &str[3] should differ by 30 bytes.

    Also, please explain how the output for the other two cases comes about.

Thanks in advance

user2165725 asked Mar 14 '13 04:03


3 Answers

The Very Short Answer:

Change this:

char **str;

To be this:

char (*str)[10];

And fix any errors in your code thereafter. The type of str is not compatible with the type of your two dimensional array. I took the liberty of changing the return type of main() to int (the C standard specifies int for main() in a hosted environment; void main() is at best an implementation-defined extension). I also marked the line where you technically have undefined behavior:

#include <stdio.h>
int main()
{
    char a[5][10]={"one","two","three","four","five"};
    char (*str)[10] = a;
    printf("%p ", &a[0]);
    printf("\n%p ", &str[0]);
    printf("\n%p ", &str[3]);
    printf("\n%p ", &str[1][56]);       // this is undefined behaviour
    printf("\n%p ", &(*(*(str+4)+1)));
    return 0;
}

You should also note that the last printf() statement has a superfluous &(*(...)). It can be stripped to look like this, which is equivalent:

    printf("\n%p ", *(str+4)+1);

Output (System Dependent)

0x7fff5fbff900 
0x7fff5fbff900 
0x7fff5fbff91e 
0x7fff5fbff942 
0x7fff5fbff929 

The Very (Very, Very) Long Answer

You are incorrect in your assumption that a two dimensional array is equivalent to a pointer-to-pointer. They are similar in usage syntax only. The following is a rough layout of how the array a is laid out in memory:

char a[5][10]

    0   1   2   3   4   5   6   7   8   9 
  -----------------------------------------
0 | o | n | e | 0 |   |   |   |   |   |   |
  -----------------------------------------
1 | t | w | o | 0 |   |   |   |   |   |   |
  -----------------------------------------
2 | t | h | r | e | e | 0 |   |   |   |   |
  -----------------------------------------
3 | f | o | u | r | 0 |   |   |   |   |   |
  -----------------------------------------
4 | f | i | v | e | 0 |   |   |   |   |   |
  -----------------------------------------

Note that the starting address of the second row, which is &a[1], is ten bytes past the starting address of the first row, &a[0] (which is, not coincidentally, the starting address of the entire array). The following demonstrates this:

#include <stdio.h>

int main()
{
    char a[5][10] = { "one", "two", "three", "four", "five" };

    printf("&a    = %p\n", &a);
    printf("&a[0] = %p\n", &a[0]);
    printf("a[0]  = %p\n", a[0]);
    printf("&a[1] = %p\n", &a[1]);
    printf("a[1]  = %p\n", a[1]);
    return 0;
}

Output

&a    = 0x7fff5fbff8e0
&a[0] = 0x7fff5fbff8e0
a[0]  = 0x7fff5fbff8e0
&a[1] = 0x7fff5fbff8ea
a[1]  = 0x7fff5fbff8ea

Note that the address at a[1] is 10 bytes (0x0a hex) past the beginning of the array. But why? To answer this one must understand the basics of typed-pointer arithmetic.


How Does Typed-Pointer Arithmetic Work?

In C and C++, all pointers except void * are typed. They have a fundamental data type associated with the pointer. For example:

int *iptr = malloc(5*sizeof(int));

iptr above points to a region of memory allocated to be the size of five integers. Addressing it as we normally address arrays looks like this:

iptr[1] = 1;

but we could just as easily address it like this:

*(iptr+1) = 1;

and the result would be the same; storage of the value 1 in the second array slot (0-slot being the first). We know the dereference operator * allows access through an address (the address stored in the pointer). But how does (iptr+1) know to skip four bytes (if your int types are 32bit, eight bytes if they're 64bit) to access the next integer slot?

Answer: because of typed-pointer arithmetic. The compiler knows how wide, in bytes, the underlying type of the pointer is (in this case, the width of the int type). When you add or subtract integer values to/from pointers, the compiler generates the appropriate code to account for this "type-width". It works with user-defined types as well. This is demonstrated below:

#include <stdio.h>

typedef struct Data
{
    int ival;
    float fval;
    char buffer[100];
} Data;

int main()
{
    int ivals[10];
    int *iptr = ivals;
    char str[] = "Message";
    char *pchr = str;
    Data data[2];
    Data *dptr = data;

    printf("iptr    = %p\n", iptr);
    printf("iptr+1  = %p\n", iptr+1);

    printf("pchr    = %p\n", pchr);
    printf("pchr+1  = %p\n", pchr+1);

    printf("dptr    = %p\n", dptr);
    printf("dptr+1  = %p\n", dptr+1);

    return 0;
}

Output

iptr    = 0x7fff5fbff900
iptr+1  = 0x7fff5fbff904
pchr    = 0x7fff5fbff8f0
pchr+1  = 0x7fff5fbff8f1
dptr    = 0x7fff5fbff810
dptr+1  = 0x7fff5fbff87c

Notice the address difference between iptr and iptr+1 is not one byte; it is four bytes (the width of an int on my system). Next, the width of a single char is demonstrated with pchr and pchr+1 as one byte. Finally, our custom data type Data, with its two pointer values dptr and dptr+1, is shown to be 0x6C, or 108, bytes wide. (It could have been larger due to structure padding and field alignment, but we were fortunate with this example that it was not.) That makes sense, since the structure contains two 4-byte data fields (an int and a float) and a char buffer 100 elements wide.

The reverse is also true, by the way, and is often not considered by even experienced C/C++ programmers. It is typed pointer differencing. If you have two valid pointers of a specified type within a contiguous region of valid memory:

int ar[10];
int *iptr1 = ar+1;
int *iptr5 = ar+5;

What do you suppose you get as a result from:

printf("%td", iptr5 - iptr1);

The answer is.. 4. Whoopee, you say. Not a big deal? Don't you believe it. This is extremely handy when using pointer arithmetic to determine the offset in a buffer of a specific element.

In summary, when you have an expression like this:

int ar[5];
int *iptr = ar;

iptr[1] = 1;

You know it is equivalent to:

*(iptr+1) = 1;

Which, in layman's terms, means "Take the address held in the iptr variable, add 1*(width of an int in bytes) bytes to it, then store the value 1 in the memory dereferenced through the resulting address."

Side Bar: This would also work. See if you can think of why:

1[iptr] = 1;

Back to your (our) sample array. Now take a look at what happens when we refer to the same address of a, but through a double pointer (which is entirely incorrect, and your compiler should at least warn you about the assignment):

char **str = a; // Error: Incompatible pointer types: char ** and char[5][10]

Well, that doesn't work. But let's assume for a moment it did. char ** is a pointer to a pointer-to-char. This means the variable itself holds nothing more than a pointer. It has no concept of base row width, etc. So assume you force the address of a into the double pointer str:

char **str = (char **)(a); // Should NEVER do this, here for demonstration only.
char *s0 = str[0];  // what do you suppose this is?

Slight update to our test program:

#include <stdio.h>

int main()
{
    char a[5][10] = { "one", "two", "three", "four", "five" };
    char **str = (char **)a;
    char *s0 = str[0];
    char *s1 = str[1];

    printf("&a    = %p\n", &a);
    printf("&a[0] = %p\n", &a[0]);
    printf("a[0]  = %p\n", a[0]);
    printf("&a[1] = %p\n", &a[1]);
    printf("a[1]  = %p\n", a[1]);

    printf("str   = %p\n", str);
    printf("s0    = %p\n", s0);
    printf("s1    = %p\n", s1);

    return 0;
}

Gives us the following result:

&a    = 0x7fff5fbff900
&a[0] = 0x7fff5fbff900
a[0]  = 0x7fff5fbff900
&a[1] = 0x7fff5fbff90a
a[1]  = 0x7fff5fbff90a
str   = 0x7fff5fbff900
s0    = 0x656e6f
s1    = 0x6f77740000

Well, str looks like it is what we want, but what is that thing in s0? Why, those are the ASCII character values for letters. Which ones? A quick check of a decent ASCII table shows they are:

0x65 : e
0x6e : n
0x6f : o

That's the word "one" in reverse (the reversal is caused by the endian handling of multi-byte values on my system), but I hope the problem is obvious. What about that second value?

0x6f : o
0x77 : w
0x74 : t

Yep, that's "two". So why are we getting the bytes in our array as pointers?

Hmmm.. Yeah, as I said in my comment: whoever told you or hinted to you that double pointers and two dimensional arrays are synonymous was incorrect. Recall our detour into typed-pointer arithmetic. Remember, this:

str[1]

and this:

*(str+1)

are synonymous. Well, what is the type of the str pointer? The type it points to is a char pointer. Therefore, the byte count difference between this:

str + 0

and this

str + 1

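will be the size of a char*. On the system that produced the output above that is 8 bytes (64-bit pointers), which explains why the apparent address in str[1] is the data eight bytes into the base of our original array: the last two (zero) bytes of the first row followed by "two". On your system it is 4 bytes (32-bit pointers), so each step of str moves four bytes into the array instead.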

Therefore, answering your first fundamental question (yes, we finally got to that).

Why &str[3] is 0xbf7f6292

Answer: This:

&str[3]

is equivalent to this:

(str + 3)

But we know from above that (str + 3) is just the address stored in str plus 3x the width, in bytes, of the type str points to, which is a char *. Well, we know from your second printf what that address is:

0xbf7f6286

And we know the width of a pointer on your system is 4 bytes (32 bit pointers). Therefore...

0xbf7f6286 + (3 * 4)

or....

0xbf7f6286 + 0x0C = 0xbf7f6292
WhozCraig answered Oct 04 '22 20:10


The first two printf calls obviously print the same address: that of the first element of the two-dimensional array, which is itself a one-dimensional array. So there should be no doubt that both point to the start of the array, that is, a[0][0].

char **str is a pointer, so &str[3] internally means (str+3). Your code is most probably running on a 32-bit machine, so it jumps 12 bytes ahead (3 times the size of a pointer) as per pointer arithmetic. Don't confuse it with the array: str is a pointer to a pointer, not an array.

0xbf7f6286 is &str[0] and 0xbf7f6292 is &str[3]; the difference is 0xc, which is correct, as expected.

printf("\n%p ", &str[1][56]);
printf("\n%p ", &(*(*(str+4)+1)));

The values printed by these last two statements are not what you expected because of the same wrong understanding of pointer arithmetic.

I hope this helps clear up your doubt.

flying-high answered Oct 04 '22 20:10


char a[5][10]={"one","two","three","four","five"};
char **str=a;

Let me explain how a two-star pointer works and what its relationship with an array is, so that you have a better chance of solving the question yourself.

Conventionally, double-star pointers are assigned the address of another pointer.

Example

char   a =  5;
char  *b = &a;
char **c = &b;

So when you dereference c one time

printf("\n%p",*c);

you will get the value contained in 'b', which is nothing but the address of 'a'.

When you dereference c two times

printf("\n%d",**c);

you will get the value of 'a' directly.

Now coming to the relationship between arrays and pointers.

In most expressions an array's name is converted to the address of its first element, so the name of the array effectively gives you the location of the array itself.

Example:

int array [5] = {1,2,3,4,5};
printf("\n%p",array);
printf("\n%p",&array);
printf("\n%p",&array[0]);

All of them will print the same address.

Hence this is an invalid pointer assignment:

char **str=a;

Because a char ** expects the address of a pointer, which in turn holds an address, but you are passing only the address of the array itself (the array name a).

So in short, a two-star pointer should not be used to handle a two dimensional array.

This question deals with handling a two dimensional array with a pointer.

Barath Ravikumar answered Oct 04 '22 20:10