I understand that execve() and family require the first element of the argument array to be the same as the executable named by the function's first argument. That is, in this:
execve(prog, args, env);
args[0] will usually be the same as prog. But I can't seem to find information as to why this is.
I also understand that executables (er, at least shell scripts) always have their calling path as the first argument when running, but I would think that the shell would do the work to put it there. I would expect execve() to just call the executable using the path given in its first argument ("prog" from above), passing the argument array ("args" from above) as one would on the command line. That is, I don't call scripts on the command line with a duplicate executable path in the args list:
/bin/ls /bin/ls /home/john
Can someone explain?
There is no requirement that the first of the arguments bear any relation to the name of the executable:
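#include <unistd.h>

int main(void)
{
    char *args[3] = { "rip van winkle", "30", 0 };
    execv("/bin/sleep", args);   /* args[0] is unrelated to the program path */
    return 1;                    /* reached only if execv() fails */
}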
Try it - on a Mac (after three tests):
make x; ./x & sleep 1; ps
The output on the third run was:
MiniMac JL: make x; ./x & sleep 1; ps
make: `x' is up to date.
[3] 5557
PID TTY TIME CMD
5532 ttys000 0:00.04 -bash
5549 ttys000 0:00.00 rip van winkle 30
5553 ttys000 0:00.00 rip van winkle 30
5557 ttys000 0:00.00 rip van winkle 30
MiniMac JL:
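As a cross-check that does not rely on ps, here is a minimal sketch that runs a program with an arbitrary argv[0] and captures what the program itself prints. It assumes /bin/sh is a shell that sets $0 from its own argv[0] when -c is given no extra operands (Bash and dash behave this way; a multi-call binary that dispatches on argv[0] may not). The name capture_first_line() is hypothetical, not a library function.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `path` with the argument vector `argv` (argv[0] may be anything) and
 * store the first line of its standard output in buf.
 * Returns 0 if the child exited with status 0, otherwise -1.
 */
static int capture_first_line(const char *path, char *const argv[],
                              char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: route stdout into the pipe, then replace the process image. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execv(path, argv);
        _exit(127);              /* reached only if execv() failed */
    }

    /* Parent: read until EOF or until the buffer is full. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < buflen - 1 &&
           (n = read(fds[0], buf + used, buflen - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';   /* keep only the first line */
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Called with the path /bin/sh and the vector { "rip van winkle", "-c", "echo \"$0\"", 0 }, this should leave "rip van winkle" in buf on systems that meet the assumption above, which matches what ps reported.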
EBM comments:
Yeah, and this makes it even more weird. In my test bash script (the target of the execve), I don't see the value of what execve has in arg[0] anywhere -- not in the environment, and not as $0.
Revising the experiment - a script called 'bash.script':
#!/bin/bash
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
And a revised program:
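#include <unistd.h>

int main(void)
{
    char *args[3] = { "rip van winkle", "30", 0 };
    execv("./bash.script", args);   /* the target is now a script, not a binary */
    return 1;                       /* reached only if execv() fails */
}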
This yields the ps output:
bash script at sleep (0: ./bash.script; *: 30)
PID TTY TIME CMD
7804 ttys000 0:00.11 -bash
7829 ttys000 0:00.00 /bin/bash ./bash.script 30
7832 ttys000 0:00.00 sleep 30
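There are two possibilities as I see it:
1. The kernel rewrites the argument list when it executes the script via the shebang ('#!/bin/bash') line, or
2. Bash itself adjusts its argument list.
How to establish the difference? I suppose copying the shell to an alternative name, and then using that alternative name in the shebang would tell us something: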
$ cp /bin/bash jiminy.cricket
$ sed "s%/bin/bash%$PWD/jiminy.cricket%" bash.script > tmp
$ mv tmp bash.script
$ chmod +x bash.script
$ ./x & sleep 1; ps
[1] 7851
bash script at sleep (0: ./bash.script; *: 30)
PID TTY TIME CMD
7804 ttys000 0:00.12 -bash
7851 ttys000 0:00.01 /Users/jleffler/tmp/soq/jiminy.cricket ./bash.script 30
7854 ttys000 0:00.00 sleep 30
$
This, I think, indicates that the kernel rewrites argv[0] when the shebang mechanism is used.
Addressing the comment by nategoose:
MiniMac JL: pwd
/Users/jleffler/tmp/soq
MiniMac JL: cat al.c
#include <stdio.h>
int main(int argc, char **argv)
{
while (*argv)
puts(*argv++);
return 0;
}
MiniMac JL: make al
cc al.c -o al
MiniMac JL: ./al a b 'c d' e
./al
a
b
c d
e
MiniMac JL: cat bash.script
#!/Users/jleffler/tmp/soq/al
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
MiniMac JL: ./x
/Users/jleffler/tmp/soq/al
./bash.script
30
MiniMac JL:
That shows that it is the shebang '#!/path/to/program' mechanism, rather than any program such as Bash, that adjusts the value of argv[0]. So, when a binary is executed, the value of argv[0] is not adjusted. When a script is executed via the shebang, the argument list is adjusted by the kernel: argv[0] is the binary listed on the shebang; if there is an argument after the shebang, that becomes argv[1]; the next argument is the name of the script file, followed by any remaining arguments from the execv() or equivalent call.
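The rewriting rule just described can be written down as a small model. This is only a sketch of the observed behaviour, not kernel code: shebang_argv() and its parameter names are hypothetical, and it assumes at most one optional argument on the shebang line (how text after the interpreter path is split varies between systems).

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Model of the argument list that a '#!' interpreter receives:
 *   { interp, [opt_arg], script, orig_argv[1], ..., NULL }
 * The caller's orig_argv[0] is dropped. Pass opt_arg as NULL when the
 * shebang line has no argument.
 * Returns a malloc()ed array of borrowed pointers; free() it when done.
 */
static char **shebang_argv(char *interp, char *opt_arg, char *script,
                           char *const orig_argv[])
{
    assert(interp != NULL && script != NULL && orig_argv != NULL);

    /* Count the arguments that follow orig_argv[0]. */
    size_t extra = 0;
    if (orig_argv[0] != NULL)
        while (orig_argv[1 + extra] != NULL)
            extra++;

    /* interp + optional arg + script + extras + terminating NULL */
    size_t count = 1 + (opt_arg != NULL ? 1 : 0) + 1 + extra + 1;
    char **out = malloc(count * sizeof *out);
    if (out == NULL)
        return NULL;

    size_t i = 0;
    out[i++] = interp;                 /* becomes argv[0] */
    if (opt_arg != NULL)
        out[i++] = opt_arg;            /* becomes argv[1] when present */
    out[i++] = script;                 /* path that was passed to execv() */
    for (size_t k = 0; k < extra; k++)
        out[i++] = orig_argv[1 + k];   /* remaining caller arguments */
    out[i] = NULL;
    return out;
}
```

Fed the values from the test program (original vector { "rip van winkle", "30", 0 }, script "./bash.script", interpreter "/Users/jleffler/tmp/soq/al"), the model yields the interpreter, the script path, and "30", which is the list that al printed; "rip van winkle" does not appear anywhere.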
MiniMac JL: cat bash.script
#!/Users/jleffler/tmp/soq/al -arg0
#!/bin/bash
#!/Users/jleffler/tmp/soq/jiminy.cricket
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
MiniMac JL: ./x
/Users/jleffler/tmp/soq/al
-arg0
./bash.script
30
MiniMac JL: