Here is my minimal reproducible example:
#include <stdio.h>

int main(int argc, char *argv[])
{
    printf(" this is the contents of argc:%d\n", argc);
    int i;
    for (i = 0; i < argc; i++) {
        printf(" argv = %d = %s\n", i, argv[i]);
    }
    return 0;
}
When I change argc in the for loop into a number, let's say 10, the code crashes before it reaches 10:
$ ./argc one two three
this is the contents of argc:4
argv = 0 = ./argc
argv = 1 = one
argv = 2 = two
argv = 3 = three
argv = 4 = (null)
argv = 5 = SHELL=/bin/bash
argv = 6 = SESSION_MANAGER=local/wajih:@/tmp/.ICE-unix/1230,unix/wajih:/tmp/.ICE-unix/1230
argv = 7 = QT_ACCESSIBILITY=1
argv = 8 = COLORTERM=truecolor
argv = 9 = XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg
If I, for example, change argc in the for loop to 100, I get a very long error message, which ends with this:
argv = 54 = GDMSESSION=ubuntu
argv = 55 = DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
argv = 56 = LC_NUMERIC=ar_AE.UTF-8
argv = 57 = _=./argc
argv = 58 = OLDPWD=/home/wajih
argv = 59 = (null)
Segmentation fault (core dumped).
I want to understand the reason this happens.
It might be easier to understand what's going on here with an analogy.
Suppose I live in a long, narrow house. The house is divided into 10 rooms, but they're all the same size and they're all arranged in a straight line.
Suppose I'm interested in robotics. Suppose I build a little robot to drive around inside my house, taking pictures of each room. Because my house's rooms are all laid out in a straight line, the robot's navigation task is pretty simple.
Once I've got the robot's software working perfectly, I ask the robot to make a complete photographic survey of all 20 rooms in my house. (Oops, I made a mistake, there.) And the robot starts driving along the main axis of the house taking pictures of each room in turn.
After it takes pictures of the first 10 rooms, there's a crashing sound as the robot drives through the end wall of the house. Its pictures of the "11th room" are of splintered wood and plaster. Its pictures of the "12th room" are of the garden outside the end of my house. But then there's another crashing sound, and the robot keeps taking pictures, and somehow, remarkably, they look like the insides of a house again!
It turns out that's because the robot has driven into my neighbor's house and is now taking pictures there.
From this silly little story we can learn two things:
But the other important aspect of the analogy is that you obviously can't depend on any of it, because too many of the circumstances are outside of your control. The robot might have damaged itself so badly driving through walls that it couldn't continue taking pictures. If there happened to be a street just past the garden at the end of my house, the robot might have gotten run over by a truck. If there happened to be a cliff just past the garden at the end of my house, the robot might have fallen into the ocean. Etc.
C, like the simpleminded robot in my story, does not have any built-in protections against running off the end of arrays. If you try to access the 15th element of a 10-element array, what you don't typically get is an error message saying "Array bounds exceeded." What you get instead is something strange, unpredictable, and wrong — except that, depending on circumstances, there might seem to be some kind of hidden meaning, which might lead you to waste time trying to figure it out, or asking about it on Stack Overflow. But rather than doing that, you might want to spend your time working on a better obstacle detection or collision avoidance algorithm for the robot, instead. :-)
See also these previous SO questions on the topic of exceeding the bounds of arrays: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, and 14.
The argv pointer has a very specific location in the program's memory.
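When you run a binary, execution always begins at some entry point. As far as your C code is concerned, that is the main() function. But to prepare the environment for the binary to start at that location, the OS and the C runtime have to do some things first.
They have to set up memory for the new process, copy over the command-line arguments and the environment variables, etc. On a given OS this layout is fixed. On Linux, the array of argv pointers ends with a NULL pointer, and the array of environment pointers typically sits right after it, ending with its own NULL. That is why reading past argv[argc] shows the environment variables, and why the program eventually crashes: past the second NULL the memory no longer holds string pointers. None of this is guaranteed by the C standard, so treat it as an explanation of what you saw, not as something to rely on.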

This principle is fundamental to computer security. If an attacker manages to leak a pointer into this segment of memory, they may be able to overwrite an environment variable (e.g. PATH) so that it points to their own binary first. HackMD has a really nice example of this: HackMD: Environment variables attack.
Image source: COMPILER, ASSEMBLER, LINKER AND LOADER: A BRIEF STORY