
What does %[^\n] mean in C?

Tags: c, string

What does %[^\n] mean in C? I saw it in a program that uses scanf to read multi-word input into a string variable. I don't understand it, though, because I learned that scanf can't read multiple words.

Here is the code:

#include <stdio.h>
#include <stdlib.h>

int main() {
    char line[100];
    scanf("%[^\n]",line);
    printf("Hello,World\n");
    printf("%s",line);
    return 0;
}
Salman Sadi asked Sep 11 '16 01:09



2 Answers

[^\n] is a scanset, which works much like a character class in a regular expression.

  • [...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
  • ^ means that the scanset is "negated": it is given by its complement.
  • ^\n: the scanset is all characters except \n.

Furthermore, fscanf (and scanf) will read the longest sequence of input characters matching the format.

So scanf("%[^\n]", s); reads all characters up to, but not including, the next \n (or EOF) and stores them in s. It is a common idiom for reading a whole line in C.

See also §7.21.6.2, "The fscanf function", in the C11 standard.

md5 answered Oct 08 '22 02:10



scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().

The C standard defines a line as follows:

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.

The call scanf("%[^\n]", line) uses the specifier "%[^\n]". It scans for an unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved, and a null character is appended.

The scan-set ^\n means all characters that are not (due to the '^') '\n'.


'\n' is not read

scanf("%[^\n]",.... does not read the new-line character '\n'. It remains in stdin, so the entire line is not read.

Buffer overflow

The code below leads to undefined behavior (UB) should more than 99 characters get read.

char line[100];
scanf("%[^\n]",line);  // buffer overflow possible

Does nothing on empty line

When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.

Failure to check the return value

scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.


Recommendation

Do not use scanf() and instead use fgets() to read a line of input.

#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1  + 1  + 1];
//                                    \n + \0 + extra to detect overly long lines

if (fgets(line, sizeof line, stdin)) {
  size_t len = strlen(line);  // strlen() needs <string.h>
  // Lop off potential trailing \n if desired.
  if (len > 0 && line[len-1] == '\n') {
    line[--len] = '\0';
  }
  if (len > EXPECTED_INPUT_LENGTH_MAX) {
    // Handle error
    // Usually includes reading rest of line if \n not found.
  }
}

The fgets() approach has its limitations too, e.g. reading embedded null characters.

Handling user input, possibly hostile, is challenging.

chux - Reinstate Monica answered Oct 08 '22 01:10
