
scanf on an istream object

Tags: c++, c, c++11

NOTE: I read the post What is the cin analougus of scanf formatted input? before asking, and it doesn't solve my problem. That post asks for the C++ way to do it, but as I mention below, the C++ way alone is sometimes inconvenient, and I have clear examples of that.

I am trying to read data from an istream object, and sometimes it is inconvenient to use only C++-style tools such as operator>>. For example, the data may be in a special form such as 123:456, so you have to imbue a locale that treats ':' as whitespace (which is very hacky, as opposed to %d:%d in scanf), or 00123, which you have to read as a string and then convert as decimal instead of octal (as opposed to %d in scanf), and possibly many other cases.

I chose istream as the interface because it can be derived from and is therefore more flexible. For example, we can create in-memory streams, or customized streams whose contents are generated on the fly. A C-style FILE*, on the other hand, is very limited when it comes to creating customized streams, at least in a standard-compliant way.

So my question is: is there a way to do scanf-like data extraction on an istream object? I think fscanf internally reads character by character from the FILE* using fgetc, and istream provides a similar interface. So it would be possible to copy the code of fscanf and replace the FILE* with the istream object, but that's very hacky. Is there a smarter and cleaner way, or is there some existing work on this?

Thanks.

Kan Li asked Jun 19 '14 08:06



2 Answers

You should never, under any circumstances, use scanf or its relatives for anything, for three reasons:

  1. Many format strings, including for instance all the simple uses of %s, are just as dangerous as gets.
  2. It is almost impossible to recover from malformed input, because scanf does not tell you how far in characters into the input it got when it hit something unexpected.
  3. Numeric overflow triggers undefined behavior: yes, that means scanf is allowed to crash the entire program if a numeric field in the input has too many digits.

Prior to C++11, the C++ specification defined istream formatted input of numbers in terms of scanf, which means that last objection is very likely to apply to them as well! (In C++11 the specification is changed to use strto* instead and to do something predictable if that detects overflow.)
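As a small illustration of the C++11 behaviour mentioned above, here is a minimal sketch (the helper name extract_int is made up for this example). It assumes a C++11-conformant standard library, where an out-of-range integer makes operator>> set failbit and store the largest (or smallest) representable value instead of invoking undefined behavior.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: tries to extract one int from `text`.
// Returns true on success; on failure the stream's failbit was set.
bool extract_int(const std::string& text, int& out) {
    std::istringstream in(text);
    in >> out;
    return !in.fail();
}

// True if `value` was clamped to the largest int, which is what a
// C++11-conformant library stores on positive overflow.
bool was_clamped_to_max(int value) {
    return value == std::numeric_limits<int>::max();
}
```

Calling extract_int("99999999999999999999", v) should then return false and leave v clamped, so the overflow is detectable by checking the stream state.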

What you should do instead is: read entire lines of input into std::string objects with getline, hand-code logic to split them up into fields (I don't remember off the top of my head what the C++-string equivalent of strsep is, but I'm sure it exists) and then convert numeric strings to machine numbers with the strtol/strtod family of functions.

I cannot emphasize this enough: THE ONLY 100% RELIABLE WAY TO CONVERT STRINGS TO NUMBERS IN C OR C++, unless you are lucky enough to have a C++ runtime that is already C++11-conformant in this regard, IS WITH THE strto* FUNCTIONS, and you must use them correctly:

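char *ends;
errno = 0;                        // clear errno so a range error can be detected afterwards
result = strtoX(s, &ends, 10);    // strtoX stands for strtol, strtoul, strtod, ...; omit 10 for floats
if (s == ends || *ends || errno)  // nothing converted, trailing junk, or out of range
  parse_error();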

(The OpenBSD manpages, linked above, explain why you have to do this fairly convoluted thing.)

(If you're clever, you can use ends and some manual logic to skip that colon, instead of strsep.)

zwol answered Oct 04 '22 04:10


I do not recommend mixing C++ input/output with C input/output. Not that they are really incompatible, but they can simply interoperate incorrectly.

For example, the Oracle documentation recommends against mixing them: http://www.oracle.com/technetwork/articles/servers-storage-dev/mixingcandcpluspluscode-305840.html

But no one stops you from reading data into a buffer and parsing it with standard C functions like sscanf.

...
std::string curString;
int a, b;
...

std::getline(inputStream, curString);

// sscanf returns the number of fields it converted successfully
int sscanfResult = sscanf(curString.c_str(), "%d:%d", &a, &b);

if (2 != sscanfResult)
   throw "error";
...

But this won't help when your stream is just one long contiguous sequence of symbols (for example, a string turned into a memory stream), because there are no line boundaries to read up to.

Writing your own fscanf from scratch, or porting the original CRT function, actually isn't the worst possible idea. Just make sure you test it thoroughly (low-level custom character manipulation has always been a source of pain in C).

I've never really tried Boost.Spirit, and such a parsing infrastructure could well be overkill for your project. But Boost libraries are usually well tested and well designed, so you could at least give it a try.

Eugene Podskal answered Oct 04 '22 04:10