This is something that should be very simple. I just want to read numbers and words from a text file that consists of tokens separated by white space. How do you do this in C#? For example, in C++, the following code would work to read an integer, float, and word. I don't want to have to use a regex or write any special parsing code.
ifstream in("file.txt");
int int_val;
float float_val;
string string_val;
in >> int_val >> float_val >> string_val;
in.close();
Also, whenever a token is read, no more than one character beyond the token should be read in. This allows further file reading to depend on the value of the token that was read. As a concrete example, consider
string decider;
int size;
string name;
in >> decider;
if (decider == "name")
    in >> name;
else if (decider == "size")
    in >> size;
else if (!decider.empty() && decider[0] == '#')
    read_remainder_of_line(in);
Parsing a binary PNM file is also a good example of why you would like to stop reading a file as soon as a full token is read in.
Brannon's answer explains how to read binary data. If you want to read text data, you should be reading strings and then parsing them - for which there are built-in methods, of course.
For example, to read a file with data:
10
10.5
hello
You might use:
using System;
using System.IO;

using (TextReader reader = File.OpenText("test.txt"))
{
    // Each value is expected on its own line.
    int x = int.Parse(reader.ReadLine());
    double y = double.Parse(reader.ReadLine());
    string z = reader.ReadLine();
}
Note that this has no error handling. In particular, it will throw an exception if the file doesn't exist, if either of the first two lines contains inappropriate data, or if there are fewer than two lines. If the file has only two lines, z will be left as null.
For a more robust solution which can fail more gracefully, you would want to check whether reader.ReadLine() returned null (indicating the end of the file) and use int.TryParse and double.TryParse instead of the Parse methods.
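As a rough sketch of that more defensive approach: the helper name TryReadRecord is made up for illustration, and the use of CultureInfo.InvariantCulture is an assumption on my part (so that "10.5" parses the same way regardless of the machine's regional settings). The input comes from a StringReader here only to keep the example self-contained; File.OpenText would be used for a real file.

```csharp
using System;
using System.Globalization;
using System.IO;

static class LineParser
{
    // Hypothetical helper: reads an int, a double and a string from three
    // consecutive lines. Returns false instead of throwing when a line is
    // missing or does not parse.
    public static bool TryReadRecord(TextReader reader, out int x, out double y, out string z)
    {
        x = 0;
        y = 0;
        z = null;

        string first = reader.ReadLine();
        if (first == null ||
            !int.TryParse(first, NumberStyles.Integer, CultureInfo.InvariantCulture, out x))
            return false;

        string second = reader.ReadLine();
        if (second == null ||
            !double.TryParse(second, NumberStyles.Float, CultureInfo.InvariantCulture, out y))
            return false;

        // A null here means the file ended after two lines.
        z = reader.ReadLine();
        return z != null;
    }

    static void Main()
    {
        using (TextReader reader = new StringReader("10\n10.5\nhello"))
        {
            if (TryReadRecord(reader, out int x, out double y, out string z))
                Console.WriteLine($"{x} {y.ToString(CultureInfo.InvariantCulture)} {z}");
            else
                Console.WriteLine("Malformed input");
        }
    }
}
```

Whether a missing third line should count as a failure depends on your format; the sketch treats it as one.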
That's assuming there's a line separator between values. If you actually want to read a string like this:
10 10.5 hello
then the code would be very similar:
using (TextReader reader = File.OpenText("test.txt"))
{
    string text = reader.ReadLine();
    string[] bits = text.Split(' ');
    int x = int.Parse(bits[0]);
    double y = double.Parse(bits[1]);
    string z = bits[2];
}
Again, you'd want to perform appropriate error detection and handling. Note that if the file really just consisted of a single line, you may want to use File.ReadAllText instead, to make it slightly simpler. There's also File.ReadAllLines, which reads the whole file into a string array of lines.
EDIT: If you need to split by any whitespace, then you'd probably be best off reading the whole file with File.ReadAllText and then using a regular expression to split it. At that point I do wonder how you represent a string containing a space.
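A regular expression is not strictly required for this. As a possible alternative, string.Split treats any whitespace character as a delimiter when it is given a null separator array, and StringSplitOptions.RemoveEmptyEntries collapses runs of whitespace. The helper name Tokenize and the sample input are made up for illustration; in practice the text would come from File.ReadAllText.

```csharp
using System;
using System.Globalization;

static class WhitespaceSplit
{
    // Hypothetical helper: splits text into tokens on any whitespace
    // (spaces, tabs, line breaks), dropping the empty entries that
    // consecutive separators would otherwise produce.
    public static string[] Tokenize(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        // Stand-in for File.ReadAllText("test.txt").
        string[] bits = Tokenize("10 \t 10.5\r\n  hello\n");

        int x = int.Parse(bits[0], CultureInfo.InvariantCulture);
        double y = double.Parse(bits[1], CultureInfo.InvariantCulture);
        string z = bits[2];

        Console.WriteLine($"{x} {y.ToString(CultureInfo.InvariantCulture)} {z}");
    }
}
```

This still reads the whole file up front, so it does not help if how you read later data depends on the tokens you have already seen.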
In my experience you generally know more about the format than this - whether there will be a line separator, or multiple values in the same line separated by spaces, etc.
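If later reads really do depend on earlier tokens, as in the question's decider example, one option is a small token reader built on TextReader.Peek and TextReader.Read. The following is only a sketch: ReadToken is a hypothetical helper, not a framework method, and the sample input is invented.

```csharp
using System;
using System.IO;
using System.Text;

static class TokenReader
{
    // Hypothetical helper: skips leading whitespace, then reads characters
    // up to (but not including) the next whitespace character.
    // Returns null when no token is left.
    public static string ReadToken(TextReader reader)
    {
        // Peek looks at the next character without consuming it.
        while (reader.Peek() >= 0 && char.IsWhiteSpace((char)reader.Peek()))
            reader.Read();

        if (reader.Peek() < 0)
            return null;

        var token = new StringBuilder();
        while (reader.Peek() >= 0 && !char.IsWhiteSpace((char)reader.Peek()))
            token.Append((char)reader.Read());

        return token.ToString();
    }

    static void Main()
    {
        string input = "size 42\n# a comment line\nname widget";
        using (TextReader reader = new StringReader(input))
        {
            string decider;
            while ((decider = ReadToken(reader)) != null)
            {
                if (decider == "name")
                    Console.WriteLine("name = " + ReadToken(reader));
                else if (decider == "size")
                    Console.WriteLine("size = " + int.Parse(ReadToken(reader)));
                else if (decider[0] == '#')
                    reader.ReadLine(); // discard the rest of the comment line
            }
        }
    }
}
```

The reader's logical position stops right after each token, so ReadLine can consume the remainder of a comment line. Be aware that a StreamReader buffers internally, so the underlying stream's position will usually be further ahead than the text you have consumed; this does not by itself make it safe to switch to raw binary reads on the same stream.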
I'd also add that mixed binary/text formats are generally unpleasant to deal with. Simple and efficient text handling tends to read into a buffer, which becomes problematic if there's binary data as well. If you need a text section in a binary file, it's generally best to include a length prefix so that just that piece of data can be decoded.
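To illustrate the length-prefix idea, here is a minimal sketch using BinaryWriter.Write(string), which writes a length prefix followed by the encoded bytes, and BinaryReader.ReadString, which reads it back. The layout (one string followed by one int) and the helper name WriteSample are assumptions for the example only; an existing format such as PNM defines its own layout and would not be readable this way.

```csharp
using System;
using System.IO;
using System.Text;

static class LengthPrefixedText
{
    // Hypothetical helper: writes a length-prefixed UTF-8 string followed
    // by a 4-byte integer, and returns the resulting bytes.
    public static byte[] WriteSample(string text, int trailing)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8))
            {
                writer.Write(text);     // length prefix, then the string's bytes
                writer.Write(trailing); // binary data following the text section
            }
            // ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    static void Main()
    {
        byte[] data = WriteSample("10 10.5 hello", 1234);

        using (var reader = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            // ReadString consumes exactly the prefixed number of bytes,
            // leaving the stream positioned at the binary data.
            string text = reader.ReadString();
            int trailing = reader.ReadInt32();

            Console.WriteLine(text);
            Console.WriteLine(trailing);
        }
    }
}
```

Because the reader knows the exact byte length of the text section, nothing is read ahead into the binary part.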