
Reading from ifstream won't read whitespace

Tags: c++, c++11

I'm implementing a custom lexer in C++ and when attempting to read in whitespace, the ifstream won't read it out. I'm reading character by character using >>, and all the whitespace is gone. Is there any way to make the ifstream keep all the whitespace and read it out to me? I know that when reading whole strings, the read will stop at whitespace, but I was hoping that by reading character by character, I would avoid this behaviour.

Attempted: .get(), as recommended by many answers, but it has the same effect as std::noskipws: I get all the spaces now, but not the newline characters that I need to lex some constructs.

Here's the offending code (extended comments truncated):

    while(input >> current) {
        always_next_struct val = always_next_struct(next);
        if (current == L' ' || current == L'\n' || current == L'\t' || current == L'\r') {
            continue;
        }
        if (current == L'/') {
            input >> current;
            if (current == L'/') {
                // explicitly empty while loop
                while(input.get(current) && current != L'\n');
                continue;
            }

I'm breaking on the while line and looking at every value of current as it comes in, and \r and \n are definitely not among them; the input just skips to the next line in the input file.

Puppy asked Jul 21 '11 10:07




2 Answers

There is a manipulator to disable the whitespace skipping behavior:

stream >> std::noskipws; 
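For what it's worth, here is a minimal sketch of how that manipulator could be combined with character-by-character extraction on a wide stream, as in the question. The helper name read_all_with_extraction is made up for illustration and is not from the question's code. The flag stays set on the stream until std::skipws is applied again.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the question): collects every character
// that operator>> yields once whitespace skipping has been turned off.
std::wstring read_all_with_extraction(std::wistream& in) {
    in >> std::noskipws;   // operator>> no longer discards leading whitespace
    std::wstring out;
    wchar_t ch;
    while (in >> ch) {     // now yields spaces, tabs and newlines too
        out += ch;
    }
    return out;
}
```

Feeding it a std::wistringstream holding L"a b\n\tc" should give back the same string, whitespace included.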
R. Martinho Fernandes answered Oct 03 '22 17:10


The operator>> eats whitespace (space, tab, newline). Use yourstream.get() to read each character.
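As a rough illustration of the get() approach, the sketch below classifies every character of a wide stream, newlines included. The CharCounts struct and the count_chars function are hypothetical names chosen for this example only; a real lexer would dispatch on the character rather than count it.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical tally type, used only for this illustration.
struct CharCounts {
    std::size_t newlines = 0;
    std::size_t blanks = 0;   // spaces, tabs and carriage returns
    std::size_t other = 0;
};

// get() is an unformatted read: it never skips whitespace,
// so '\n' shows up like any other character.
CharCounts count_chars(std::wistream& in) {
    CharCounts counts;
    wchar_t ch;
    while (in.get(ch)) {
        if (ch == L'\n') {
            ++counts.newlines;
        } else if (ch == L' ' || ch == L'\t' || ch == L'\r') {
            ++counts.blanks;
        } else {
            ++counts.other;
        }
    }
    return counts;
}
```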

Edit:

Beware: Platforms (Windows, Un*x, Mac) differ in how they encode a newline. It can be '\n', '\r', or both ("\r\n"). What you see also depends on how you open the file stream (text or binary).
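One way to sidestep the platform differences, sketched here under the assumption that the lexer only cares about '\n', is to open the file in binary mode and normalize the line endings yourself. Both function names below are invented for this example.

```cpp
#include <fstream>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: reads everything with get() and maps "\r\n" and a
// lone '\r' to a single '\n', so later code only has to test for '\n'.
std::string read_normalized(std::istream& in) {
    std::string out;
    char ch;
    while (in.get(ch)) {
        if (ch == '\r') {
            if (in.peek() == '\n') {
                in.get(ch);   // swallow the '\n' of a "\r\n" pair
            }
            ch = '\n';
        }
        out += ch;
    }
    return out;
}

// Hypothetical wrapper: std::ios::binary switches off any newline
// translation the platform would otherwise apply in text mode.
std::string read_file_normalized(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    return read_normalized(file);
}
```

With this, input such as "a\r\nb\rc\nd\r" comes out as "a\nb\nc\nd\n" regardless of which convention produced it.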

Edit (analyzing code):

After

    while(input.get(current) && current != L'\n');
    continue;

there will be a \n in current, unless the end of the file was reached. After that you continue with the outermost while loop, where the first character of the next line is read into current. Is that not what you wanted?

I tried to reproduce your problem (using char and cin instead of wchar_t and wifstream):

    //: get.cpp : compile, then run: get < get.cpp

    #include <iostream>

    int main() {
      char c;

      while (std::cin.get(c))
      {
        if (c == '/')
        {
          char last = c;
          if (std::cin.get(c) && c == '/')
          {
            // std::cout << "Read to EOL\n";
            while(std::cin.get(c) && c != '\n'); // this comment will be skipped
            // std::cout << "go to next line\n";
            std::cin.putback(c);
            continue;
          }
          else { std::cin.putback(c); c = last; }
        }
        std::cout << c;
      }
      return 0;
    }

This program, applied to itself, eliminates all C++ line comments in its output. The inner while loop doesn't eat up all text to the end of file. Please note the putback(c) statement. Without that the newline would not appear.

If it doesn't work the same way for wifstream, that would be very strange, except for one possible reason: the opened text file is not saved with 16-bit characters, and the \n char ends up in the wrong byte...

René Richter answered Oct 03 '22 18:10