Understanding C pointers, arrays and negative indices

Tags: arrays, c, pointers

I am trying to learn pointers in C, and taking a quiz for this purpose. Here is the question:

#include <stdio.h>

char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;

int main()
{
    printf("%s ", **++cpp);
    printf("%s ", *--*++cpp+3);
    printf("%s ", *cpp[-2]+3);
    printf("%s ", cpp[-1][-1]+1);
    return 0;
}

The result of the line:

 printf("%s ", *cpp[-2]+3);

confuses me, but let me explain step by step how I understand it.

  • char *c[] is an array of pointers to char.
  • char **cp[] is an array of pointers to pointers to char (I think of it as a wrapper for c[] in reverse order).
  • char ***cpp is a pointer to a pointer to a pointer to char (I think of it as a wrapper for cp[] that allows in-place modifications).

**++cpp - since cpp points to cp, ++cpp will point to cp+1, which holds c+2, so the double dereference will print TEST.

*--*++cpp+3 - since cpp now points to cp+1, ++cpp will point to cp+2, which holds c+1, and the next operation, --, will give us a pointer to c, so the last dereference plus 3 will print sQuiz.

Here comes the confusion:

cpp[-2] - since now cpp points to cp+2, which I can confirm with

printf("%p\n", cpp); // 0x601090   
printf("%p\n", cp+2); // 0x601090

Here I print the addresses of the pointers in c:

printf("c - %p\n", c); // c - 0x601060
printf("c+1 - %p\n", c+1); // c+1 - 0x601068
printf("c+2 - %p\n", c+2); // c+2 - 0x601070
printf("c+3 - %p\n", c+3); // c+3 - 0x601078

so when I dereference like this, *(cpp[0]) or **cpp, I get the value MCQ of c+1, as expected

printf("%p\n", &*(cpp[0])); // 0x601068

but when I say *(cpp[-2]) I get QUIZ, but I would rather expect to get some garbage value.

So my questions are:

  1. How the magic with *--*++cpp+3 works, I mean what is modified by the -- part that allows me to get MCQ instead of TEST when I dereference like this **cpp, I assume that this pointer *++cpp+3 preserves the state after the -- is applied, but cannot imagine yet how it works.

  2. Why the following works the way it works (the cpp[-2] part):

    printf("%p\n", &*cpp[1]); // 0x601060 -> c
    printf("%p\n", &*(cpp[0])); // 0x601068 -> c+1
    printf("%p\n", &*(cpp[-1])); // 0x601070 -> c+2
    printf("%p", &*(cpp[-2])); // 0x601078 -> c+3
    

It seems to be in reverse order. I can accept &*(cpp[0]) pointing to c+1, but I would expect &*cpp[1] to point to c+2 and &*(cpp[-1]) to point to c, based on what I found in this question: Are negative array indexes allowed in C?

  3. I obviously confuse many things, and may call something a pointer that in reality is not one, I would like to grasp the concept of pointer, so will be glad if someone show me where I am wrong.
MisterAlejandro asked May 25 '18 18:05


1 Answer

Let me first clear up the negative index confusion, since we will use it later to answer the other questions:

when I say *(cpp[-2]) I get QUIZ, but I would rather expect to get some garbage value.

Negative values are fine. Note the following:

By definition, the subscript operator E1[E2] is exactly identical to *((E1)+(E2)).

Knowing that, since cpp == cp+2, then:

cpp[-2] == *(cpp-2) == *(cp+2-2) == *cp == c+3

And therefore:

*cpp[-2]+3 == *(c+3)+3 == c[3]+3

That is the address of "QUIZ" advanced by 3 characters, so you are passing printf the address of the character Z in "QUIZ", and it starts printing the string from there.

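Actually, in case you wonder, (-2)[cpp] is also equivalent and valid. The parentheses are required: without them, -2[cpp] parses as -(2[cpp]), which applies unary minus to a pointer and does not compile.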


Now, the questions:

  1. How the magic with *--*++cpp+3 works, I mean what is modified by the -- part that allows me to get MCQ instead of TEST when I dereference like this **cpp, I assume that this pointer *++cpp+3 preserves the state after the -- is applied, but cannot imagine yet how it works.

Let's break it down (recall that cpp == cp+1 here, as you correctly point out):

    ++cpp   // cpp+1 == cp+2 (and saving this new value in cpp)
   *++cpp   // *(cp+2) == cp[2]
 --*++cpp   // cp[2]-1 == c (and saving this new value in cp[2])
*--*++cpp   // *c
*--*++cpp+3 // *c+3

And that points to sQuiz as you correctly pointed out. However, cpp and cp[2] were modified, so you now have:

cp[] == {c+3, c+2, c, c}
cpp  == cp+2

The fact that cp[2] changed is not used in the rest of the quiz, but it is important to note, especially since you printed the values of the pointers. See:

  2. Why the following works the way it works (the cpp[-2] part):

    printf("%p\n", &*cpp[1]); // 0x601060 -> c
    printf("%p\n", &*(cpp[0])); // 0x601068 -> c+1
    printf("%p\n", &*(cpp[-1])); // 0x601070 -> c+2
    printf("%p", &*(cpp[-2])); // 0x601078 -> c+3
    

First, let's simplify &*x to x. Then, following the same reasoning as above with cpp == cp+2, you can see that:

cpp[ 1] == cp[3] == c
cpp[ 0] == cp[2] == c   // Note this is different to what you had
cpp[-1] == cp[1] == c+2
cpp[-2] == cp[0] == c+3

  3. I obviously confuse many things, and may call something a pointer that in reality is not one, I would like to grasp the concept of pointer, so will be glad if someone show me where I am wrong.

You actually got it quite well! :-)

Basically, a pointer is a value that represents a memory address (on typical platforms it behaves much like an integer). However, when you perform arithmetic on it, the offset is scaled by the size of the type it points to. That is why, if c == 0x601060 and sizeof(char*) == 8, then:

c+1 == 0x601060 + 1*sizeof(char*) == 0x601068 // Instead of 0x601061
c+2 == 0x601060 + 2*sizeof(char*) == 0x601070 // Instead of 0x601062
Acorn answered Nov 15 '22 20:11