
Best practice for returning a variable-length string in C

Tags: c, function, string

I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following best practice regarding malloc, realloc, and free.

The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...

http://www.cplusplus.com/reference/cstdlib/realloc/

that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.

So the prototype for my function is:

//decode a uri encoded string
char *net_uri_to_text(char *);

I don't like the way I'm doing it because I have to free the pointer after running the function:

char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);

Which means that malloc() and realloc() are called inside my function and free() is called outside my function.

I have a background in high-level languages (Perl, PL/pgSQL, Bash), so my instinct is proper encapsulation of such things, but that might not be the best practice in C.

The question: Is my way best practice, or is there a better way I should follow?

full example

Compiles and runs with two warnings about the unused argc and argv arguments; you can safely ignore them.

example.c:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char *net_uri_to_text(char *);

int main(int argc, char ** argv) {
  char * chr_input = "testing123%5a%5b%5cabc";
  char * chr_output = net_uri_to_text(chr_input);
  printf("%s\n", chr_output);
  free(chr_output);
  return 0;
}

//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
  //define variables
  int int_length = strlen(chr_input);
  int int_new_length = int_length;
  //+1 leaves room for the terminating null byte
  char * chr_output = malloc(int_length + 1);
  char * chr_output_working = chr_output;
  char * chr_input_working = chr_input;
  int int_output_working = 0;
  unsigned int uint_hex_working;
  //while not a null byte
  while(*chr_input_working != '\0') {
    //if %
    if (*chr_input_working == '%') {
      //then put correct char in
      sscanf(chr_input_working + 1, "%02x", &uint_hex_working);
      *chr_output_working = (char)uint_hex_working;
      //printf("special char:%c, %c, %d<\n", *chr_output_working, (char)uint_hex_working, uint_hex_working);
      //realloc (again +1 so the null byte still fits)
      chr_input_working++;
      chr_input_working++;
      int_new_length -= 2;
      chr_output = realloc(chr_output, int_new_length + 1);
      //output working must be the new pointer plus how many chars we've done
      chr_output_working = chr_output + int_output_working;
    } else {
      //put char in
      *chr_output_working = *chr_input_working;
    }
    //increment pointers and number of chars in output working
    chr_input_working++;
    chr_output_working++;
    int_output_working++;
  }
  //last null byte
  *chr_output_working = '\0';
  return chr_output;
}
Michael asked Jun 12 '13 17:06


1 Answer

It's perfectly OK to return malloc'd buffers from functions in C, as long as you document that they do and that the caller is responsible for freeing the result. Lots of libraries do that, even though no function in the standard library does.
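As a minimal sketch of that convention, the version below states the ownership rule in the comment above the function. The names net_uri_to_text_alloc and hex_value are hypothetical, not from the question. It also relies on an observation that is an assumption about the intended decoding: every "%XX" escape shrinks to one byte, so the output can never be longer than the input, and a single malloc of strlen + 1 bytes is enough without any realloc. Malformed escapes are copied through unchanged here, which is one possible policy among several.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit; caller guarantees isxdigit(c). */
static int hex_value(unsigned char c)
{
    if (isdigit(c))
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/*
 * Decodes a URI-encoded string into a newly allocated buffer.
 * Returns NULL if the allocation fails.
 * OWNERSHIP: the caller must free() the returned pointer.
 */
char *net_uri_to_text_alloc(const char *encoded)
{
    assert(encoded != NULL);

    /* Each "%XX" becomes one byte, so strlen + 1 always suffices. */
    char *out = malloc(strlen(encoded) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (const char *src = encoded; *src != '\0'; ) {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(hex_value((unsigned char)src[1]) * 16
                          + hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            *dst++ = *src++; /* ordinary byte, or a malformed '%' kept as-is */
        }
    }
    *dst = '\0';
    return out;
}
```

The buffer may be a few bytes larger than the decoded text; for short strings that is usually cheaper than shrinking it with realloc.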

If you can cheaply compute the number of characters that need to be written to the buffer (or a not-too-pessimistic upper bound on it), you can offer a function that does that and let the user call it before allocating.
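For URI decoding specifically, such a size function could look like the sketch below. Both function names are hypothetical. The first returns a cheap upper bound (the decoded text is never longer than the encoded text); the second makes one extra pass to get the exact size, assuming that a '%' not followed by two hex digits is kept as a literal character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Cheap upper bound, including the terminating NUL. */
size_t net_uri_decoded_bound(const char *encoded)
{
    assert(encoded != NULL);
    return strlen(encoded) + 1;
}

/* Exact size, including the terminating NUL; costs one pass over the input. */
size_t net_uri_decoded_size(const char *encoded)
{
    assert(encoded != NULL);

    size_t size = 1; /* the terminating NUL */
    while (*encoded != '\0') {
        if (encoded[0] == '%' && isxdigit((unsigned char)encoded[1])
                              && isxdigit((unsigned char)encoded[2])) {
            encoded += 3; /* a valid "%XX" escape decodes to one byte */
        } else {
            encoded += 1; /* ordinary byte, or a malformed '%' */
        }
        size++;
    }
    return size;
}
```

A caller would then allocate net_uri_decoded_bound(input) bytes (or use a stack buffer of that size) and hand the buffer to the decoder, so allocation and freeing both stay on the caller's side.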

It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:

/*
 * Decodes the uri-encoded string encoded into buf, which has room for
 * len characters (including NUL).
 * Returns the number of characters needed (including NUL). If that number
 * is greater than len, the result in buf is incomplete and you should try
 * again with a larger buffer.
 */
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
    size_t space_needed = 0;

    while (decoding_needs_to_be_done()) {
        // decode characters, but only write them to buf
        // if it wouldn't overflow;
        // increment space_needed regardless
    }
    return space_needed;
}

Now the caller is responsible for the allocation, and would do something like

size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);

size_t needed = net_uri_to_text(input, result, len);
if (needed > len) {
    // buffer was too small: grow it and try again
    result = xrealloc(result, needed);
    net_uri_to_text(input, result, needed);
}

(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
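To make the pattern above concrete, here is one way the buffer-filling function and its caller might be written out in full. This is a sketch under assumptions, not the answer's actual implementation: the names net_uri_to_text_buf, hex_digit and decode_with_retry are hypothetical, malformed '%' escapes are copied through unchanged, and plain malloc/realloc with explicit NULL checks stand in for the made-up xmalloc/xrealloc.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit; caller guarantees isxdigit(c). */
static int hex_digit(unsigned char c)
{
    return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

/*
 * Decodes encoded into buf, which has room for len bytes.
 * Returns the number of bytes needed, including the NUL. If that is
 * greater than len, buf holds an incomplete result and must not be used.
 */
size_t net_uri_to_text_buf(const char *encoded, char *buf, size_t len)
{
    assert(encoded != NULL && (buf != NULL || len == 0));
    size_t space_needed = 0;

    while (*encoded != '\0') {
        char c;
        if (encoded[0] == '%' && isxdigit((unsigned char)encoded[1])
                              && isxdigit((unsigned char)encoded[2])) {
            c = (char)(hex_digit((unsigned char)encoded[1]) * 16
                     + hex_digit((unsigned char)encoded[2]));
            encoded += 3;
        } else {
            c = *encoded++;
        }
        if (space_needed < len)
            buf[space_needed] = c; /* write only while it fits */
        space_needed++;            /* count regardless */
    }
    if (space_needed < len)
        buf[space_needed] = '\0';
    return space_needed + 1;
}

/* Caller side: guess a size, then retry once with the reported size. */
char *decode_with_retry(const char *input, size_t first_guess)
{
    assert(first_guess > 0);
    char *result = malloc(first_guess);
    if (result == NULL)
        return NULL;

    size_t needed = net_uri_to_text_buf(input, result, first_guess);
    if (needed > first_guess) {
        char *bigger = realloc(result, needed);
        if (bigger == NULL) {
            free(result);
            return NULL;
        }
        result = bigger;
        net_uri_to_text_buf(input, result, needed);
    }
    return result; /* the caller allocated it, so the caller frees it */
}
```

Because the function reports the size it needs, the retry loop runs at most once, and the caller is free to use a stack buffer, an arena, or malloc as it sees fit.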

Fred Foo answered Oct 27 '22 15:10