Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove spaces from a string in C

Tags:

c

string

spaces

What is the easiest and most efficient way to remove spaces from a string in C?

like image 689
Tyler Treat Avatar asked Nov 13 '09 00:11

Tyler Treat


4 Answers

Easiest and most efficient don't usually go together…

Here's a possible solution for in-place removal:

void remove_spaces(char* s) {     char* d = s;     do {         while (*d == ' ') {             ++d;         }     } while (*s++ = *d++); } 
like image 79
Aaron Avatar answered Sep 21 '22 22:09

Aaron


Here's a very compact, but entirely correct version:

do while(isspace(*s)) s++; while(*d++ = *s++); 

And here, just for my amusement, are code-golfed versions that aren't entirely correct, and get commenters upset.

If you can risk some undefined behavior, and never have empty strings, you can get rid of the body:

while(*(d+=!isspace(*s++)) = *s); 

Heck, if by space you mean just space character:

while(*(d+=*s++!=' ')=*s); 

Don't use that in production :)

like image 43
Kornel Avatar answered Sep 23 '22 22:09

Kornel


As we can see from the answers posted, this is surprisingly not a trivial task. When faced with a task like this, it would seem that many programmers choose to throw common sense out the window, in order to produce the most obscure snippet they possibly can come up with.

Things to consider:

  • You will want to make a copy of the string, with spaces removed. Modifying the passed string is bad practice, it may be a string literal. Also, there are sometimes benefits of treating strings as immutable objects.
  • You cannot assume that the source string is not empty. It may contain nothing but a single null termination character.
  • The destination buffer can contain any uninitialized garbage when the function is called. Checking it for null termination doesn't make any sense.
  • Source code documentation should state that the destination buffer needs to be large enough to contain the trimmed string. Easiest way to do so is to make it as large as the untrimmed string.
  • The destination buffer needs to hold a null terminated string with no spaces when the function is done.
  • Consider if you wish to remove all white space characters or just spaces ' '.
  • C programming isn't a competition over who can squeeze in as many operators on a single line as possible. It is rather the opposite, a good C program contains readable code (always the single-most important quality) without sacrificing program efficiency (somewhat important).
  • For this reason, you get no bonus points for hiding the insertion of null termination of the destination string, by letting it be part of the copying code. Instead, make the null termination insertion explicit, to show that you haven't just managed to get it right by accident.

What I would do:

void remove_spaces (char* restrict str_trimmed, const char* restrict str_untrimmed)
{
  while (*str_untrimmed != '\0')
  {
    if(!isspace(*str_untrimmed))
    {
      *str_trimmed = *str_untrimmed;
      str_trimmed++;
    }
    str_untrimmed++;
  }
  *str_trimmed = '\0';
}

In this code, the source string "str_untrimmed" is left untouched, which is guaranteed by using proper const correctness. It does not crash if the source string contains nothing but a null termination. It always null terminates the destination string.

Memory allocation is left to the caller. The algorithm should only focus on doing its intended work. It removes all white spaces.

There are no subtle tricks in the code. It does not try to squeeze in as many operators as possible on a single line. It will make a very poor candidate for the IOCCC. Yet it will yield pretty much the same machine code as the more obscure one-liner versions.

When copying something, you can however optimize a bit by declaring both pointers as restrict, which is a contract between the programmer and the compiler, where the programmer guarantees that the destination and source are not the same address. This allows more efficient optimization, since the compiler can then copy straight from source to destination without temporary memory in between.

like image 39
Lundin Avatar answered Sep 21 '22 22:09

Lundin


In C, you can replace some strings in-place, for example a string returned by strdup():

char *str = strdup(" a b c ");

char *write = str, *read = str;
do {
   if (*read != ' ')
       *write++ = *read;
} while (*read++);

printf("%s\n", str);

Other strings are read-only, for example those declared in-code. You'd have to copy those to a newly allocated area of memory and fill the copy by skipping the spaces:

char *oldstr = " a b c ";

char *newstr = malloc(strlen(oldstr)+1);
char *np = newstr, *op = oldstr;
do {
   if (*op != ' ')
       *np++ = *op;
} while (*op++);

printf("%s\n", newstr);

You can see why people invented other languages ;)

like image 20
Andomar Avatar answered Sep 23 '22 22:09

Andomar