
Are the pointers to strings in argv modifiable? [duplicate]

Recently (Jan 2016, in case the question persists long enough) we had the question "Are the strings in argv modifiable?".
In the comment section of this answer, @2501 and I argued about whether it is really the strings of characters (an example character being **argv) that are modifiable, or the pointers to the strings (an example pointer being *argv).

The appropriate standard quotation is from the C11 standard draft N1570, §5.1.2.2.1/2:

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

So are the pointers to the strings as pointed to by argv modifiable?

cadaniluk asked Jan 30 '16 19:01


2 Answers

As OP quoted in the question, the C11 standard explicitly states that the argc and argv variables, and the strings pointed to by the argv array, are modifiable. Whether the pointers themselves are modifiable is the question at hand. The standard does not seem to explicitly state it one way or the other.

There are two key points to note about the wording in the standard:

  1. If the pointers were supposed to be immutable, the standard could have made it clear by requiring main to be declared as int main(int argc, char *const argv[]), as haccks mentioned in another answer to this question.

    The fact that nowhere in the standard is const mentioned in association with argv seems deliberate. That is, the lack of const does not seem optional, but dictated by the standard.

  2. The standard consistently calls argv an array. Modifying an array means modifying its members. Thus, it seems obvious that the wording in the standard refers to modifying the members of the argv array when it states that argv is modifiable.

    On the other hand, array parameters in C (based on C11 draft N1570, §6.7.6.3p7) "shall be adjusted to 'qualified pointer to type'". Thus, the following code,

    int foo(int x[2], int y[2])
    {
        if (x[0] > y[0])
            x = y;
        return x[1];
    }
    

    is valid C11, since x and y are adjusted to int *x and int *y, respectively. (This is also reiterated in C11 draft N1570, §6.3.2.1p3: "... array ... is converted to an expression with type 'pointer to type' that points to the initial element of the array ...".) Obviously, the same code would not be valid if x and y were declared as local or global arrays rather than function parameters, because an array cannot be assigned to.

As far as language-lawyerism goes, I'd say the standard does not state it one way or another, although it implies the pointers too should be modifiable. Thus, as an answer to OP: both.
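
To make the three "levels" concrete, here is a minimal sketch. The function name touch_all_levels is hypothetical (it is not from the standard or from any library), and the sketch assumes the usual invariant that argv[argc] is a null pointer. It only shows what compiles and what each statement modifies; it does not settle what the standard guarantees for the pointers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical demo: touches each "level" of an argv-style array
   and returns the new argument count. */
int touch_all_levels(int argc, char *argv[])
{
    assert(argv != NULL && argc >= 0 && argv[argc] == NULL);

    /* Level 1: the characters (**argv).
       Explicitly modifiable per C11 5.1.2.2.1/2. */
    if (argc > 0 && argv[0][0] != '\0')
        argv[0][0] = (char)toupper((unsigned char)argv[0][0]);

    /* Level 2: the pointers (*argv).
       Not explicitly covered by the wording; compiles because
       the element type is char *, not char *const. */
    if (argc > 1) {
        for (int i = 1; i < argc; i++)
            argv[i] = argv[i + 1];  /* last iteration copies the NULL terminator */
        argc--;
    }

    /* Level 3: the parameter itself.
       argv is adjusted to char **, so it can be reassigned. */
    argv++;
    (void)argv;

    return argc;
}
```

Called as touch_all_levels(argc, argv) from main(), this would capitalize the first character of argv[0] and drop argv[1] by shifting the remaining pointers down.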


In practice, there is a very long tradition of the pointers in the argv array being modifiable. Many libraries have initialization functions that take a pointer to argc and a pointer to the argv array, and some of them do modify the pointers in the argv array (removing options specific to the library); for example GTK+ gtk_init() and MPI_Init() (although at least OpenMPI explicitly states it does not examine or modify them). Look for the parameter declaration (int *argc, char ***argv); the only reason for it -- assuming the function is meant to be called from main() using (&argc, &argv) -- is to parse and remove the library-specific parameters from the command line, modifying both argc and the pointers in argv as needed.
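
The (int *argc, char ***argv) pattern can be sketched as follows. This is not how GTK+ or any MPI implementation does it; mylib_init and the "--mylib-" prefix are hypothetical names made up for illustration, and the sketch assumes every argv[i] for i < argc is a valid string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library initializer in the (int *argc, char ***argv) style.
   Removes every argument that starts with "--mylib-" and compacts the rest. */
void mylib_init(int *argc, char ***argv)
{
    assert(argc != NULL && argv != NULL && *argv != NULL);

    static const char prefix[] = "--mylib-";
    char **args = *argv;
    int kept = 0;

    for (int i = 0; i < *argc; i++) {
        /* argv[0] (the program name) is always kept */
        if (i > 0 && strncmp(args[i], prefix, sizeof prefix - 1) == 0)
            continue;               /* consume the library's own option */
        args[kept++] = args[i];     /* overwrite a pointer in the argv array */
    }
    args[kept] = NULL;              /* keep the argv[argc] == NULL invariant */
    *argc = kept;
}
```

From main() this would be called as mylib_init(&argc, &argv); afterwards the application sees only the arguments the library did not claim.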

(I originally stated that the getopt() facility in POSIX relies on the pointers being modifiable -- the feature dating back to 1980, adopted by most Unix flavours, and standardized in POSIX.2 in 1997 -- but that is incorrect, as Jonathan Leffler pointed out in a comment: POSIX getopt() does not modify the actual pointers; only GNU getopt() does, and only when the POSIXLY_CORRECT environment variable is not set. Both GNU getopt_long() and BSD getopt_long() modify the pointers unless POSIXLY_CORRECT is set, but they are much younger and less widespread compared to getopt().)

In Unix land, it was considered "portable" to modify the contents of the strings pointed to by the argv[] array, and to have the modified strings visible in the process list. One example of how this was useful is readproctitle in DJB's daemontools package. (Note that the strings have to be modified in place, and cannot be extended, for the changes to be visible in the process list.)
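
As a rough sketch of the "modify in place, never extend" rule, the helper below overwrites an argv string without writing past its original terminator. The name overwrite_in_place is hypothetical and this is not taken from readproctitle; whether the change actually shows up in the process list depends on the operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite an argv string in place without growing it.
   Returns the number of characters of `text` that fit. */
size_t overwrite_in_place(char *arg, const char *text)
{
    assert(arg != NULL && text != NULL);

    size_t room = strlen(arg);       /* never write past the original terminator */
    size_t n = strlen(text);
    if (n > room)
        n = room;                    /* truncate instead of extending */

    memcpy(arg, text, n);
    memset(arg + n, '\0', room - n); /* clear what is left of the old contents */
    return n;
}
```

For example, overwrite_in_place(argv[1], "hidden") replaces the text of the first argument while staying inside the storage it originally occupied.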

All this indicates a very long tradition, basically almost since the birth of C as a programming language, and definitely preceding the standardization of C, of treating argc, argv, the pointers in the argv array, and the contents of the strings pointed to by those pointers, as modifiable.

Because the intent of the C standard is not to define new behaviour, but to codify existing behaviour across implementations (to promote portability and reliability and so on), it seems safe to assume that it was an unintended omission on the part of the standard writers not to explicitly specify the pointers in the argv array as modifiable. Anything else would break tradition, and be explicitly contrary to the POSIX standard (which is also intended to promote portability across systems, and extends C features not included in the ISO C standard).

Nominal Animal answered Oct 23 '22 00:10


Whether a pointer is modifiable or not depends on the constness of the pointer. The parameter argv is declared as char *argv[] or char **argv. Whether an environment treats this as char *const argv[] is up to that environment (I am not aware of any that does).
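
For comparison, this is roughly what the const-qualified form would look like on an ordinary function. The name blank_first_chars is hypothetical; the point is only that with char *const argv[] the characters remain writable while the pointers do not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: with char *const argv[], the pointers are read-only
   but the characters they point to are not. */
void blank_first_chars(int argc, char *const argv[])
{
    assert(argv != NULL);

    for (int i = 0; i < argc; i++) {
        if (argv[i][0] != '\0')
            argv[i][0] = '_';   /* OK: the characters are not const */

        /* argv[i] = NULL; */   /* would not compile: argv[i] is a const pointer */
    }
}
```

An array declared as char *args[] can be passed to this function directly, since adding const at that level is an implicit conversion in C.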

haccks answered Oct 22 '22 22:10