
Convert characters in a c string to their escape sequences

Tags: c, escaping

I need a function like string ToLiteral(string input) from this post. Such that

char *literal = to_literal("asdf\r\n");

Would yield literal ==> "asdf\\r\\n".

I've googled around, but haven't been able to find anything (I guess I must be using the wrong terms). However, I assume that a library with this functionality must be out there somewhere...

Thank you for the interesting answers. Googling "c string escape function", by the way, seems to be the key to finding even more examples, and GLib provides g_strescape(), which seems to be exactly what I need.

Christian Madsen asked Aug 20 '10 21:08



2 Answers

There's no built-in function for this, but you could whip one up:

/* Replaces special characters in a C-string with their escape sequences
 *
 * src must be a C-string with a NUL terminator
 *
 * dest must be large enough to store the resulting expanded
 * string. A buffer of size 2 * strlen(src) + 1 will always be sufficient
 *
 * NUL characters are not expanded to \0 (otherwise how would we know when
 * the input string ends?)
 */

void expand_escapes(char* dest, const char* src)
{
  char c;

  /* Extra parentheses make the intentional assignment explicit */
  while ((c = *(src++)) != '\0') {
    switch(c) {
      case '\a': 
        *(dest++) = '\\';
        *(dest++) = 'a';
        break;
      case '\b': 
        *(dest++) = '\\';
        *(dest++) = 'b';
        break;
      case '\t': 
        *(dest++) = '\\';
        *(dest++) = 't';
        break;
      case '\n': 
        *(dest++) = '\\';
        *(dest++) = 'n';
        break;
      case '\v': 
        *(dest++) = '\\';
        *(dest++) = 'v';
        break;
      case '\f': 
        *(dest++) = '\\';
        *(dest++) = 'f';
        break;
      case '\r': 
        *(dest++) = '\\';
        *(dest++) = 'r';
        break;
      case '\\': 
        *(dest++) = '\\';
        *(dest++) = '\\';
        break;
      case '\"': 
        *(dest++) = '\\';
        *(dest++) = '\"';
        break;
      default:
        *(dest++) = c;
    }
  }

  *dest = '\0'; /* Ensure nul terminator */
}

Note that I've left out translation of the "escape" character (ASCII 27), since C has no standard escape sequence for it: \e is a compiler extension (GCC and Clang accept it), while portable code has to spell it numerically, e.g. \033 or \x1b. You can add in whichever applies to you.
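As a rough sketch of how the escape character (and any other non-printable byte) could be handled, the variant below falls back to a three-digit octal escape. The names simple_escape and expand_escapes_octal are made up for this illustration and are not part of the answer above; the code also assumes the default "C" locale for isprint.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the letter of the two-character escape
 * for c, or 0 if c has no such escape. */
static char simple_escape(char c)
{
    switch (c) {
    case '\a': return 'a';
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\v': return 'v';
    case '\f': return 'f';
    case '\r': return 'r';
    case '\\': return '\\';
    case '\"': return '"';
    default:   return 0;
    }
}

/* dest must hold at least 4 * strlen(src) + 1 bytes, because an octal
 * escape such as \033 takes four characters. */
void expand_escapes_octal(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        char letter = simple_escape(*src);

        if (letter != 0) {
            *dest++ = '\\';
            *dest++ = letter;
        } else if (isprint(c)) {
            *dest++ = (char)c;
        } else {
            /* Always three octal digits, so a digit that follows in the
             * input cannot be absorbed into the escape. */
            dest += sprintf(dest, "\\%03o", (unsigned)c);
        }
    }

    *dest = '\0';
}
```

Octal is used here rather than \x because a hex escape in C has no length limit: "\x1b" followed by the letter c would be read as the single escape \x1bc.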

If you want a function that allocates your destination buffer for you:

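#include <stdlib.h>
#include <string.h>

/* Returned buffer may be up to twice as large as necessary.
 * Returns NULL if allocation fails; the caller must free() the result. */
char* expand_escapes_alloc(const char* src)
{
   char* dest = malloc(2 * strlen(src) + 1);
   if (dest != NULL)
      expand_escapes(dest, src);
   return dest;
}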
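If the oversized buffer is a concern, one possible alternative is to measure first and then allocate exactly what is needed. The following is only a sketch: to_literal borrows the name from the question, escape_letter is a hypothetical helper, and neither comes from the answer above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the escape letter for c, or 0 if c is
 * copied through unchanged. The two strings must stay in matching order. */
static char escape_letter(char c)
{
    static const char from[] = "\a\b\t\n\v\f\r\\\"";
    static const char to[]   = "abtnvfr\\\"";
    const char *pos = (c != '\0') ? strchr(from, c) : NULL;

    return pos ? to[pos - from] : 0;
}

/* Returns a newly allocated, exactly sized escaped copy of src,
 * or NULL if allocation fails. The caller must free() the result. */
char *to_literal(const char *src)
{
    size_t len = 0;
    const char *s;
    char *dest, *d;

    assert(src != NULL);

    /* First pass: count the output length. */
    for (s = src; *s != '\0'; s++)
        len += escape_letter(*s) ? 2 : 1;

    dest = malloc(len + 1);
    if (dest == NULL)
        return NULL;

    /* Second pass: write the escaped characters. */
    for (s = src, d = dest; *s != '\0'; s++) {
        char letter = escape_letter(*s);

        if (letter != 0) {
            *d++ = '\\';
            *d++ = letter;
        } else {
            *d++ = *s;
        }
    }
    *d = '\0';

    return dest;
}
```

The trade-off is a second walk over the input in exchange for no wasted memory; for short strings either approach is likely fine.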
Tyler McHenry answered Oct 05 '22 18:10


I think I'd do the conversion something like this:

// warning: untested code.
#include <string.h>

// output must have room for 2 * strlen(input) + 1 bytes.
void make_literal(char const *input, char *output) {
    // the following two arrays must be maintained in matching order:
    static const char inputs[] = "\a\b\f\n\r\t\v\\\"\'";
    static const char outputs[] = "abfnrtv\\\"\'";

    const char *pos;

    for (; *input; input++) {
        if (NULL != (pos = strchr(inputs, *input))) {
            *output++ = '\\';
            *output++ = outputs[pos - inputs];
        }
        else
            *output++ = *input;
    }
    *output = '\0';
}

In theory, this could be a bit slower than (for one example) Tyler McHenry's code. In particular, his use of a switch statement allows (but doesn't require) constant time selection of the correct path. In reality, given the sparsity of the values involved, you probably won't get constant time selection, and the string involved is so short that the difference will normally be quite small in any case. In the other direction, I'd expect this to be easier to maintain (e.g., if you want to support more escape sequences, adding them should be pretty easy as long as the form remains constant).
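To illustrate the maintainability point, here is one way the lookup could be restructured so that each character sits next to its replacement and the replacement need not keep the same two-character form. This is a sketch under assumptions: escape_entry, escape_table, and make_literal_table are names invented for this example, and the ESC entry is just a sample of a longer sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table type: one raw character and its full replacement. */
struct escape_entry {
    char raw;
    const char *text;
};

static const struct escape_entry escape_table[] = {
    { '\a', "\\a" }, { '\b', "\\b" }, { '\f', "\\f" }, { '\n', "\\n" },
    { '\r', "\\r" }, { '\t', "\\t" }, { '\v', "\\v" }, { '\\', "\\\\" },
    { '\"', "\\\"" }, { '\'', "\\'" },
    { '\033', "\\033" }  /* ESC, written as a three-digit octal escape */
};

/* output needs room for 4 * strlen(input) + 1 bytes, since the longest
 * replacement in the table is four characters. */
void make_literal_table(const char *input, char *output)
{
    const size_t count = sizeof escape_table / sizeof escape_table[0];

    assert(input != NULL && output != NULL);

    for (; *input != '\0'; input++) {
        size_t i;

        /* Linear search; the table is tiny. */
        for (i = 0; i < count; i++) {
            if (escape_table[i].raw == *input)
                break;
        }

        if (i < count) {
            size_t n = strlen(escape_table[i].text);
            memcpy(output, escape_table[i].text, n);
            output += n;
        } else {
            *output++ = *input;
        }
    }
    *output = '\0';
}
```

Adding a sequence then means adding a single table row, at the cost of a slightly larger worst-case output buffer.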

Jerry Coffin answered Oct 05 '22 18:10