Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the safest way to pass strings around in C?

I have a program in C using Solaris with VERY ancient compatibility it seems. Many examples, even here on SO, don't work, as well as lots of code I've written on Mac OS X.

So when using very strict C, what is the safest way to pass strings?

I'm currently using char pointers all over the place, due to what I thought was simplicity. So I have functions that return char*, I'm passing char* to them, etc.

I'm already seeing strange behavior, like a char* I passed having its value right when I enter a function, and then the value being mysteriously gone OR corrupted/overwritten after something simple like one printf() or an malloc to some other pointer.

One approach to the functions, which I'm sure is incorrect, could be:

char *myfunction(char *somestr) {    
  char localstr[MAX_STRLENGTH] = strcpy(localstr, somestr);
  free(somestr);
  /* ... some work ... */
  char *returnstr = strdup(localstr);
  return returnstr;
}

This seems...sloppy. Can anyone point me in the right direction on a simple requirement?

Update

One example of a function where I am at a loss for what is happening. Not sure if this is enough to figure it out, but here goes:'

char *get_fullpath(char *command, char *paths) {
  printf("paths inside function %s\n", paths); // Prints value of paths just fine

  char *fullpath = malloc(MAX_STRLENGTH*sizeof(char*));

  printf("paths after malloc %s\n", paths); // paths is all of a sudden just blank
}
like image 938
chucknelson Avatar asked Apr 15 '10 00:04

chucknelson


2 Answers

Well-written C code adheres to the following convention:

  • All functions return a status code of type int, where a return value of 0 indicates success, and a -1 indicates failure. On failure, the function should set errno with an appropriate value (e.g. EINVAL).
  • Values that are "reported" by a function should be reported via the use of "out parameters". In other words, one of the parameters should be a pointer to the destination object.
  • Ownership of pointers should belong to the invoker; consequently, a function should not free any of its parameters, and should only free objects that it, itself, allocates with malloc/calloc.
  • Strings should be passed either as const char* objects or as char* objects, depending on whether the string is to be overwritten. If the string is not to be modified, then const char* should be used.
  • Whenever an array is passed that is not a NUL-terminated string, a parameter should be provided indicating the the number of elements in the array or the capacity of that array.
  • When a modifiable string/buffer (i.e. char*) object is passed into a function, and that function is to overwrite, append, or otherwise modify the string, a parameter indicating the capacity of the string/buffer needs to be provided (so as to allow for dynamic buffer sizes and to avoid bufffer overflow).

I should point out that in your example code, you are returning localstr and not returnstr. Consequently, you are returning an address of an object in the current function's stack frame. The current function's stack frame will disappear once the function has returned. Invoking another function immediately afterwards will likely alter the data in that location, leading to the corruption that you have observed. Returning the address of a local variable leads to "undefined behavior" and is incorrect.

Edit
Based on your updated code (get_fullpath), it is clear that the problem is not in your function get_fullpath, but rather in the function that is calling it. Most likely, the paths variable is being supplied by a function that returns the address of a local variable. Consequently, when you create a local variable within get_fullpath, it is using the same exact location on the stack that paths previously occupied. Since "paths" is aliasing "fullpaths", it is basically overwritten with the address of the buffer that you've malloced, which is blank.

Edit 2
I have created a C Coding Conventions page on my website with more detailed recommendations, explanations, and examples for writing C code, in case you are interested. Also, the statement that localstr is being returned instead of returnstr is no longer true since the question has last been edited.

like image 196
Michael Aaron Safyan Avatar answered Oct 25 '22 03:10

Michael Aaron Safyan


You can't return a pointer to an array that's allocated locally within the function. As soon as the function returns, that array is going to be clobbered.

Also, when you put

char localstr[MAX_STRLENGTH] = strcpy(localstr, somestr);

what happens is that strcpy() will copy the bytes into the localstr[] array, but then you have an unnecessary assignment thing going on. You could probably get the intended effect as two lines, thus ..

char localstr[MAX_STRLENGTH];
strcpy(localstr, somestr);

Also, it's bad form to embed a free() call inside a function like this. Ideally the free() should be visible at the same level of scope where the malloc() occurred. By the same logic it's a little dubious to allocate memory down in a function this way.

If you want a function to modify a string, a common convention goes something like so

// use a prototype like this to use the same buffer for both input and output
int modifyMyString(char buffer[], int bufferSize) {
    // .. operate you find in buffer[],
    //    leaving the result in buffer[]
    //    and be sure not to exceed buffer length
    // depending how it went, return EXIT_FAILURE or maybe
    return EXIT_SUCCESS;

// or separate input and outputs
int workOnString(char inBuffer[], int inBufSize, char outBuffer[], int outBufSize) {
    // (notice, you could replace inBuffer with const char *)
    // leave result int outBuffer[], return pass fail status
    return EXIT_SUCCESS;

Not embedding malloc() or free() inside will also help avoid memory leaks.

like image 38
JustJeff Avatar answered Oct 25 '22 02:10

JustJeff