In C, why can I see a value written past the end of an array in a different variable?

Lately I've been spending my spare time on small experiments in C for fun: implementing simple algorithms and data structures.

But I ran into something interesting that I still cannot explain.

max_arr_count_index takes on whatever value I assign to arr[5], which is one element past the end of the array.

Can someone explain this to me? I know I should not do this: assigning to the element one past the end of the array (here, arr[5] = 30 in the problem case) is not safe, and it is undefined behavior according to the standard.

I am not going to do this in real code, but I want to understand what is happening under the hood.

LLVM and GCC give me the same result.

The code and results are below:

[No-problem case: I do not assign a value past the end of the array]

#include <stdio.h>

int arr[] = {11,33,55,77,88};
int max_arr_count_index = (sizeof(arr) / sizeof(arr[0]));

// print all
void print_all_arr(int* arr)
{
    // just print all arr datas regarding index.
    for(int i = 0; i < max_arr_count_index; i++) {
        printf("arr[%d] = %d \n", i, arr[i]);
    }
}

int main(int argc, const char * argv[]) {
    // insert code here...
    printf("[before]max_arr_count_index : %d\n", max_arr_count_index);
    printf("[before]The original array elements are :\n");
    print_all_arr(arr);
    arr[0] = 1;
    arr[1] = 2;
    arr[2] = 3;
    arr[3] = 4;
    arr[4] = 5;
    // arr[5] = 1000;
    printf("[after]max_arr_count_index : %d\n", max_arr_count_index);
    printf("[after]The array elements after :\n");

    print_all_arr(arr);

    return 0;
}

The result of the no-problem case is below:

[before]max_arr_count_index : 5
[before]The original array elements are :
arr[0] = 11 
arr[1] = 33 
arr[2] = 55 
arr[3] = 77 
arr[4] = 88 
[after]max_arr_count_index : 5
[after]The array elements after :
arr[0] = 1 
arr[1] = 2 
arr[2] = 3 
arr[3] = 4 
arr[4] = 5 
Program ended with exit code: 0

[Problem case: I assign a value past the end of the array]

#include <stdio.h>

int arr[] = {11,33,55,77,88};
int max_arr_count_index = (sizeof(arr) / sizeof(arr[0]));

// print all
void print_all_arr(int* arr)
{
    // just print all arr datas regarding index.
    for(int i = 0; i < max_arr_count_index; i++) {
        printf("arr[%d] = %d \n", i, arr[i]);
    }
}

int main(int argc, const char * argv[]) {
    // insert code here...
    printf("[before]max_arr_count_index : %d\n", max_arr_count_index);
    printf("[before]The original array elements are :\n");
    print_all_arr(arr);
    arr[0] = 1;
    arr[1] = 2;
    arr[2] = 3;
    arr[3] = 4;
    arr[4] = 5;

    /* Point is this one. 
       If I assign arr[5] 30, then, max_arr_count_index is changed also as            
       30. if I assign arr[5] 10000 max_arr_count_index is assigned 10000.
    */

    arr[5] = 30;

    /* Point is this one. 
       If I assign arr[5] 30, then, max_arr_count_index is changed also as            
       30. if I assign arr[5] 10000 max_arr_count_index is assigned 10000.
    */

    printf("[after]max_arr_count_index : %d\n", max_arr_count_index);
    printf("[after]The array elements after arr[5] is assigned 30 :\n");

    print_all_arr(arr);

    return 0;
}

The result is below:

[before]max_arr_count_index : 5
[before]The original array elements are :
arr[0] = 11 
arr[1] = 33 
arr[2] = 55 
arr[3] = 77 
arr[4] = 88 
[after]max_arr_count_index : 30
[after]The array elements after arr[5] is assigned 30 :
arr[0] = 1 
arr[1] = 2 
arr[2] = 3 
arr[3] = 4 
arr[4] = 5 
arr[5] = 30 
arr[6] = 0 
arr[7] = 0 
arr[8] = 0 
arr[9] = 0 
arr[10] = 0 
arr[11] = 0 
arr[12] = 0 
arr[13] = 0 
arr[14] = 0 
arr[15] = 0 
arr[16] = 0 
arr[17] = 0 
arr[18] = 0 
arr[19] = 0 
arr[20] = 0 
arr[21] = 0 
arr[22] = 0 
arr[23] = 0 
arr[24] = 0 
arr[25] = 0 
arr[26] = 0 
arr[27] = 0 
arr[28] = 0 
arr[29] = 0 
Program ended with exit code: 0

asked Nov 04 '16 16:11

boraseoksoon

2 Answers

So obviously, as far as the C standard is concerned, this is undefined behaviour, and the compiler could make demons fly out of your nose and it would be fine-ish.

But you want to go deeper, as you ask for "under the hood", so we essentially have to look at the assembly output. An excerpt (produced with gcc -g -o test test.c and objdump -S --disassemble test) is:

int main(int argc, const char * argv[]) {
 743:   55                      push   %rbp
 744:   48 89 e5                mov    %rsp,%rbp
 747:   48 83 ec 10             sub    $0x10,%rsp
 74b:   89 7d fc                mov    %edi,-0x4(%rbp)
 74e:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
    // insert code here...
    printf("[before]max_arr_count_index : %d\n", max_arr_count_index);
 752:   8b 05 fc 08 20 00       mov    0x2008fc(%rip),%eax        # 201054 <max_arr_count_index>
 758:   89 c6                   mov    %eax,%esi
 75a:   48 8d 3d 37 01 00 00    lea    0x137(%rip),%rdi        # 898 <_IO_stdin_used+0x18>
 761:   b8 00 00 00 00          mov    $0x0,%eax
 766:   e8 35 fe ff ff          callq  5a0 <printf@plt>
    printf("[before]The original array elements are :\n");
 76b:   48 8d 3d 4e 01 00 00    lea    0x14e(%rip),%rdi        # 8c0 <_IO_stdin_used+0x40>
 772:   e8 19 fe ff ff          callq  590 <puts@plt>
    print_all_arr(arr);
 777:   48 8d 3d c2 08 20 00    lea    0x2008c2(%rip),%rdi        # 201040 <arr>
 77e:   e8 6d ff ff ff          callq  6f0 <print_all_arr>
    arr[0] = 1;
 783:   c7 05 b3 08 20 00 01    movl   $0x1,0x2008b3(%rip)        # 201040 <arr>
 78a:   00 00 00 
    arr[1] = 2;
 78d:   c7 05 ad 08 20 00 02    movl   $0x2,0x2008ad(%rip)        # 201044 <arr+0x4>
 794:   00 00 00 
    arr[2] = 3;
 797:   c7 05 a7 08 20 00 03    movl   $0x3,0x2008a7(%rip)        # 201048 <arr+0x8>
 79e:   00 00 00 
    arr[3] = 4;
 7a1:   c7 05 a1 08 20 00 04    movl   $0x4,0x2008a1(%rip)        # 20104c <arr+0xc>
 7a8:   00 00 00 
    arr[4] = 5;
 7ab:   c7 05 9b 08 20 00 05    movl   $0x5,0x20089b(%rip)        # 201050 <arr+0x10>
 7b2:   00 00 00 
    /* Point is this one. 
       If I assign arr[5] 30, then, max_arr_count_index is changed also as            
       30. if I assign arr[5] 10000 max_arr_count_index is assigned 10000.
    */

    arr[5] = 30;
 7b5:   c7 05 95 08 20 00 1e    movl   $0x1e,0x200895(%rip)        # 201054 <max_arr_count_index>
 7bc:   00 00 00 
    /* Point is this one. 
       If I assign arr[5] 30, then, max_arr_count_index is changed also as            
       30. if I assign arr[5] 10000 max_arr_count_index is assigned 10000.
    */

    printf("[after]max_arr_count_index : %d\n", max_arr_count_index);
 7bf:   8b 05 8f 08 20 00       mov    0x20088f(%rip),%eax        # 201054 <max_arr_count_index>
 7c5:   89 c6                   mov    %eax,%esi
 7c7:   48 8d 3d 22 01 00 00    lea    0x122(%rip),%rdi        # 8f0 <_IO_stdin_used+0x70>
 7ce:   b8 00 00 00 00          mov    $0x0,%eax
 7d3:   e8 c8 fd ff ff          callq  5a0 <printf@plt>
    printf("[after]The array elements after insertion :\n");
 7d8:   48 8d 3d 39 01 00 00    lea    0x139(%rip),%rdi        # 918 <_IO_stdin_used+0x98>
 7df:   e8 ac fd ff ff          callq  590 <puts@plt>

    print_all_arr(arr);
 7e4:   48 8d 3d 55 08 20 00    lea    0x200855(%rip),%rdi        # 201040 <arr>
 7eb:   e8 00 ff ff ff          callq  6f0 <print_all_arr>

    return 0;
 7f0:   b8 00 00 00 00          mov    $0x0,%eax
}

As you can see, even at that level, the disassembler already knows that you are effectively setting max_arr_count_index. But why?

It is because the memory layout produced by GCC is simply that way (and we used -g with gcc to make it embed debug information so that the disassembler can know which memory location is which field). You have a global array of five ints, and a global int variable, declared right after each other. The global int variable is simply right behind the array in memory. Accessing the integer right behind the end of the array thus gives max_arr_count_index.

Remember that accessing element i of an array arr of e.g. ints is (at least on all architectures I know) simply accessing the memory at byte address arr + sizeof(int)*i, where arr is the address of the first element. (In C source, the pointer expression arr + i already includes the scaling by sizeof(int).)

As said, this is undefined behaviour. GCC could also order the global int variable before the array, which would lead to different effects, possibly even the program terminating when attempting to access arr[5] if there is no valid memory page at that location.

answered Nov 08 '22 18:11

Jonas Schäfer

Accessing an array out of bounds invokes undefined behavior. Nothing good can be expected in this case. The size of arr is 5, so you can access arr[0] through arr[4].

Setting UB aside for a moment, the explanation for the behavior

/* Point is this one. 
   If I assign arr[5] 30, then, max_arr_count_index is changed also as            
   30. if I assign arr[5] 10000 max_arr_count_index is assigned 10000.
*/

could be that the variable max_arr_count_index is declared just after the array arr. The compiler may have allocated the memory for max_arr_count_index immediately after the last element of arr. For example, if arr[4] is at 0x100, then max_arr_count_index is allocated at 0x104 (assuming a 4-byte int), so the address one past the end of arr is 0x104. Since &arr[5] is then the same address as that of max_arr_count_index, assigning a value to arr[5] writes that value to the address of max_arr_count_index. Please note that this is not necessarily exactly what is happening; it is an intuition for this behavior. Once there is UB, all bets are off.

answered Nov 08 '22 19:11

haccks

Tags: arrays, c, pointers, boundary
