
Strings in C: pitfalls and techniques

Tags: c, string

I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).

Do you know of any lists (like cheat sheets) or your own experience in the matter?

I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.

Thank you.

Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.

Dervin Thunk, asked Aug 17 '09 22:08




2 Answers

It's obvious, but I think it's important to know that a string is nothing more than an array of bytes terminated by a zero byte. C strings aren't all that user-friendly, as you probably know.
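To make the "array of bytes plus a zero byte" point concrete, here is a minimal sketch of how a length can be found by walking to the terminator. The name byte_length is made up for illustration; it only mirrors what strlen does conceptually and is not how any particular libc implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a standard function): count the bytes before
   the terminating zero byte, which is conceptually what strlen does. */
size_t byte_length(const char *s)
{
    assert(s != NULL);

    const char *p = s;
    while (*p != '\0')      /* *p is the same thing as p[0] */
        ++p;
    return (size_t)(p - s); /* the terminator itself is not counted */
}
```

For an array declared as char buf[] = "hello", byte_length(buf) gives 5 while sizeof buf is 6, because the array also stores the terminating zero byte.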

  • Writing a zero byte somewhere in the string will truncate it.
  • Going out of bounds generally ends badly.
  • Never, ever use strcpy, strcmp, strcat, etc. on data that may be unterminated or too long for the destination; use the length-bounded variants instead: strncmp, strncat, strndup, ...
  • Avoid strncpy. strncpy does not always zero-terminate your string! If the source string doesn't fit in the destination buffer, it truncates the string and does not write a nul byte at the end of the buffer. Also, even if the source string is much shorter than the destination, strncpy still fills the rest of the destination buffer with zeroes. I personally use strlcpy (a BSD extension, not part of standard C).
  • Don't use printf(string); use printf("%s", string) instead. Think about the consequences if the user puts a %d in the string.
  • You can't compare strings with ==, which only compares the two pointers:
    if( s1 == s2 )
        doStuff(s1);
    You have to compare the characters themselves. Use strcmp or, better, strncmp:
    if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
        doStuff(s1);
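
As a rough illustration of the copying and comparing points above, here is one possible sketch. It uses snprintf instead of strlcpy because snprintf is in standard C (C99); that swap is my choice, not something from the list. The helper names copy_string and strings_equal are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes).
   Unlike strncpy, the result is always zero-terminated.
   Returns true if the whole string fit, false if it was truncated. */
bool copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* "%s" keeps any % characters in src from being read as a format. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}

/* Hypothetical helper: compare contents, not addresses.
   Both arguments must be zero-terminated strings. */
bool strings_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

With a 4-byte destination, copying "abcdef" leaves "abc" plus the terminator in the buffer and returns false, so the caller can tell that truncation happened.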
Kasper, answered Oct 16 '22 17:10


Abusing strlen() will dramatically worsen performance.

for( int i = 0; i < strlen( string ); i++ ) {
    processChar( string[i] );
}

will have at least O(n^2) time complexity, because strlen() walks the whole string again on every iteration, whereas

size_t length = strlen( string );  /* strlen returns size_t, not int */
for( size_t i = 0; i < length; i++ ) {
    processChar( string[i] );
}

will have at least O(n) time complexity, since the length is computed only once. This is not so obvious to people who haven't taken the time to think about it.
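
A related option, sketched here as an assumption rather than as part of the answer above, is to skip strlen() entirely and walk the string with a pointer until the terminating zero byte. The function count_char is a made-up example task standing in for processChar.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example task: count how often ch occurs in s.
   The loop visits each byte once and stops at the terminator,
   so no separate length computation is needed. */
size_t count_char(const char *s, char ch)
{
    assert(s != NULL);

    size_t count = 0;
    for (const char *p = s; *p != '\0'; ++p) {
        if (*p == ch)
            ++count;
    }
    return count;
}
```

This is still a single O(n) pass, and it avoids the question of which integer type to use for the index.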

sharptooth, answered Oct 16 '22 17:10