What are some of the drawbacks to using C-style strings?

Tags: c++, c, string

I know that buffer overruns are one potential hazard to using C-style strings (char arrays). If I know my data will fit in my buffer, is it okay to use them anyway? Are there other drawbacks inherent to C-style strings that I need to be aware of?

EDIT: Here's an example close to what I'm working on:

char buffer[1024];
// fp is the FILE * returned by popen()
while (fgets(buffer, sizeof buffer, fp) != NULL) {
    // parse one line of command output (now in buffer) here.
}

This code is taking data from a FILE pointer that was created using a popen("df") command. I'm trying to run Linux commands and parse their output to get information about the operating system. Is there anything wrong (or dangerous) with setting the buffer to some arbitrary size this way?
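
On the buffer-size question: fgets() never writes more than the size it is given, so a fixed buffer does not overrun by itself. The practical risk is that a line longer than the buffer comes back in several pieces. A minimal sketch of one way to handle that, assuming C++ is acceptable; the helper name `read_lines` and the deliberately tiny 16-byte buffer are made up for illustration:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: reads every line from fp through a fixed-size buffer.
// A line longer than the buffer arrives over several fgets() calls, so the
// pieces are appended until one ends in '\n' (or the stream ends).
std::vector<std::string> read_lines(std::FILE *fp) {
    std::vector<std::string> lines;
    std::string current;
    char buffer[16];  // deliberately small to exercise the long-line path
    while (std::fgets(buffer, sizeof buffer, fp) != nullptr) {
        current += buffer;
        if (!current.empty() && current.back() == '\n') {
            current.pop_back();          // drop the trailing newline
            lines.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        lines.push_back(current);        // final line without a newline
    }
    return lines;
}
```

The same function should work on a stream opened with popen(), since it only relies on fgets().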

Bill the Lizard asked Nov 23 '08 14:11


3 Answers

There are a few disadvantages to C strings:

  1. Getting the length is a relatively expensive operation.
  2. No embedded NUL characters are allowed.
  3. The signed-ness of chars is implementation defined.
  4. The character set is implementation defined.
  5. The width of the char type in bits (CHAR_BIT) is implementation defined, even though sizeof(char) is always 1.
  6. You have to keep track separately of how each string was allocated, and therefore how it must be freed, or whether it needs to be freed at all.
  7. No way to refer to a slice of the string as another string.
  8. Strings are not immutable, meaning access to them from multiple threads must be synchronized separately.
  9. Strings cannot be manipulated at compile time.
  10. Switch cases cannot be strings.
  11. The C preprocessor does not recognize strings in expressions.
  12. Cannot pass strings as template arguments (C++).
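
Points 1 and 2 can be seen in a few lines. This is only an illustrative sketch in C++; the function names `c_length` and `with_embedded_nul` are invented for the example:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as the C library sees it: strlen() walks the bytes and stops at
// the first NUL, so the cost grows with the string and anything after an
// embedded NUL is invisible.
std::size_t c_length(const char *s) {
    return std::strlen(s);
}

// Building a std::string from a pointer plus an explicit byte count keeps
// the embedded NUL, because the length is stored rather than inferred.
std::string with_embedded_nul() {
    const char raw[] = "ab\0cd";             // 5 characters plus the terminator
    return std::string(raw, sizeof raw - 1); // copy all 5 characters
}
```

Here `c_length("ab\0cd")` reports 2, while `with_embedded_nul().size()` reports 5.
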
Walter Bright answered Oct 16 '22 04:10


C strings lack the following aspects of their C++ counterparts:

  • Automatic memory management: you have to allocate and free their memory manually.
  • Extra capacity for concatenation efficiency: C++ strings often have a capacity greater than their size. This allows increasing the size without many reallocations.
  • Embedded NULs: by definition a NUL character ends a C string; C++ strings keep an internal size counter, so they don't need a special value to mark their end.
  • Sensible comparison and assignment operators: even though comparison of C string pointers is permitted, it's almost always not what was intended. Similarly, assigning C string pointers (or passing them to functions) creates ownership ambiguities.
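
A small sketch of the comparison and capacity points, for illustration only; the helper names (`same_pointer`, `same_text_c`, `same_text_cpp`, `repeat`) are hypothetical:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Compares addresses, not characters: two separate buffers holding the
// same text are "different" by this test.
bool same_pointer(const char *a, const char *b) {
    return a == b;
}

// What is usually intended for C strings: compare the contents.
bool same_text_c(const char *a, const char *b) {
    return std::strcmp(a, b) == 0;
}

// std::string's operator== compares contents directly.
bool same_text_cpp(const std::string &a, const std::string &b) {
    return a == b;
}

// Reserving capacity up front lets repeated appends avoid reallocating.
std::string repeat(const std::string &piece, std::size_t n) {
    std::string out;
    out.reserve(piece.size() * n);
    for (std::size_t i = 0; i < n; ++i) {
        out += piece;
    }
    return out;
}
```

With `char a[] = "df"; char b[] = "df";`, `same_pointer(a, b)` is false while both content comparisons are true.
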
efotinis answered Oct 16 '22 06:10


Not having the length accessible in constant time is a serious overhead in many applications: every strlen() call has to scan the whole string to find the terminator.
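
A common way this shows up is calling strlen() in a loop condition. The sketch below is illustrative, with made-up function names; a compiler may sometimes hoist the call on its own, but that is not something to rely on:

```cpp
#include <cstddef>
#include <cstring>

// Calls strlen() on every iteration, so the string may be rescanned each
// time around the loop (quadratic work in the worst case).
std::size_t count_spaces_slow(const char *s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < std::strlen(s); ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}

// Computes the length once and reuses it, so the loop is linear.
std::size_t count_spaces_fast(const char *s) {
    std::size_t count = 0;
    const std::size_t len = std::strlen(s);
    for (std::size_t i = 0; i < len; ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}
```

Both functions return the same result; only the amount of scanning differs.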

Will Dean answered Oct 16 '22 05:10