
I have three loops over an array of (char*) elements in C. Why does the third fail?

While experimenting with methods for stepping through an array of strings in C, I developed the following small program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


typedef char* string;

int main() {
  char *family1[4] = {"father", "mother", "son", NULL};
  string family2[4] = {"father", "mother", "son", NULL};

  /* Loop #1: Using a simple pointer to step through "family1". */
  for (char **p = family1; *p != NULL; p++) {
    printf("%s\n", *p);
  }
  putchar('\n');

  /* Loop #2: Using the typedef for clarity and stepping through
   * family2. */
  for (string *s = family2; *s != NULL; s++) {
    printf("%s\n", *s);
  }
  putchar('\n');

  /* Loop #3: Again, we use the pointer, but with a unique increment
   * step in our for loop.  This fails to work.  Why? */
  for (string s = family2[0]; s != NULL; s = *(&s + 1)) {
    printf("%s\n", s);
  }
}

My specific question involves the failure of Loop #3. When run through the debugger, Loops #1 and #2 complete successfully, but the last loop fails for an unknown reason. I would not have asked this here, except that it shows me that I have some critical misunderstanding regarding the "&" operator.

My question (and current understanding) is this:

family2 is an array-of-pointer-to-char. Thus, when s is set to family2[0] we have a (char*) pointing to "father". Therefore, taking &s should give us the equivalent of family2, pointing to the first element of family2 after the expected pointer decay. Why, then, doesn't *(&s + 1) point to the next element, as expected?

Many thanks,
lifecrisis


EDIT -- Update and Lessons Learned:

The following list is a summary of all of the relevant facts and interpretations that explain why the third loop does not work like the first two.

  1. s is a separate variable holding a copy of the value (a pointer-to-char) from the variable family2[0]. I.e., these two equivalent values are positioned at SEPARATE locations in memory.
  2. family2[0] up to family2[3] are contiguous elements of memory, and s has no presence in this space, though it does contain the same value that is stored in family2[0] at the start of our loop.
  3. These first two facts mean that &s and &family2[0] are NOT equal. Thus, adding one to &s will return a pointer to unknown/undefined data, whereas adding one to &family2[0] will give you &family2[1], as desired.
  4. In addition, the update step in the third for loop doesn't actually result in s stepping forward in memory on each iteration. This is because &s is constant throughout all iterations of our loop. This is the cause of the observed infinite loop.

Thanks to EVERYONE for their help!
lifecrisis

asked Jan 10 '17 14:01 by lifecrisis


2 Answers

When you do s = *(&s + 1) the variable s is a local variable in an implicit scope that only contains the loop. When you do &s you get the address of that local variable, which is unrelated to any of the arrays.

The difference from the previous loop is that there, s is a pointer to the first element of the array (a string*), so incrementing it moves from one array element to the next.


To explain it a little more "graphically" what you have in the last loop is something like

+----+      +---+      +------------+
| &s | ---> | s | ---> | family2[0] |
+----+      +---+      +------------+

That is, &s points to s, and s holds the same value as family2[0] (the address of "father"). Nothing here points into the array itself.

When you do &s + 1 you effectively have something like

+------------+
| family2[0] |
+------------+
^
|
+---+----
| s | ...
+---+----
^   ^
|   |
&s  &s + 1

answered Sep 21 '22 12:09 by Some programmer dude


Pictures help a lot:

            +----------+
            | "father" |                                    
            +----------+         +----------+      +-------+      NULL 
   /-----------→1000            | "mother" |      | "son" |        ↑
+-----+           ↑              +----------+      +-------+        |
|  s  | ?         |                  2000            2500           |
+-----+           |                   ↑                ↑            |
 6000  6008 +----------------+----------------+--------------+--------------+
            |   family2[0]   |   family2[1]   |  family2[2]  |  family2[3]  |
            +----------------+----------------+--------------+--------------+
                  5000              5008            5016           5024

                    (    &s refers to 6000    ) 
                    ( &s+1 refers to 6008 but )
                    (   *(&s+1) invokes UB    )

The addresses are arbitrary integers chosen for simplicity.


The point is that, although s and family2[0] both point to the same base address of the string literal "father", the two pointers are unrelated: each is stored at its own memory location. Hence *(&s+1) != family2[1].

You hit UB when you do *(&s + 1) because &s + 1 is a memory location you're not supposed to tamper with, i.e., it doesn't belong to any object you created. You never know what's stored there => Undefined Behavior.

Thanks @2501 for pointing out several mistakes!

answered Sep 18 '22 12:09 by Spikatrix