I’m writing text to a binary file in C# and see a difference in the number of bytes written between writing a string and writing a character array. I’m using System.IO.BinaryWriter and watching BinaryWriter.BaseStream.Length as the writes occur. These are my results:
using (BinaryWriter bw = new BinaryWriter(File.Open("data.dat", FileMode.Create), Encoding.ASCII))
{
    string value = "Foo";
    // Writes 4 bytes
    bw.Write(value);
    // Writes 3 bytes
    bw.Write(value.ToCharArray());
}
I don’t understand why the string overload writes 4 bytes when I’m writing only 3 ASCII characters. Can anyone explain this?
The documentation for BinaryWriter.Write(string) states that it writes a length-prefixed string to this stream. The overload for Write(char[]) has no such prefixing.
It would seem to me that the extra data is the length.
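One way to check this is to write to a MemoryStream and dump the raw bytes. The following is a minimal sketch, not taken from the original post; the class name LengthPrefixDemo is made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

class LengthPrefixDemo
{
    static void Main()
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            // String overload: one length byte, then the three character bytes.
            bw.Write("Foo");
            bw.Flush();
            Console.WriteLine(BitConverter.ToString(ms.ToArray())); // 03-46-6F-6F

            // char[] overload: only the three character bytes are appended.
            bw.Write("Foo".ToCharArray());
            bw.Flush();
            Console.WriteLine(BitConverter.ToString(ms.ToArray())); // 03-46-6F-6F-46-6F-6F
        }
    }
}
```

The leading 0x03 in the output is the extra byte: it is the number of encoded bytes that follow.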
EDIT:
Just to be a bit more explicit, use Reflector. You will see that it has this piece of code in there as part of the Write(string) method:
this.Write7BitEncodedInt(byteCount);
It is a way to encode an integer in as few bytes as possible: each output byte carries 7 bits of the value, and the high bit signals that another byte follows. For the short strings we use day to day (fewer than 128 encoded bytes), the length fits in one byte. For longer strings, the prefix starts to use more bytes.
Here is the code for that function just in case you are interested:
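protected void Write7BitEncodedInt(int value)
{
    uint num = (uint) value;
    // Emit 7 bits at a time, low bits first; the high bit marks "more bytes follow".
    while (num >= 0x80)
    {
        this.Write((byte) (num | 0x80));
        num = num >> 7;
    }
    // Final byte: high bit clear.
    this.Write((byte) num);
}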
After prefixing the length using this encoding, it writes the bytes for the characters in the desired encoding.
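To see the prefix grow with the string length, here is a small sketch that writes ASCII strings of several lengths and prints the prefix bytes. The class name PrefixGrowthDemo and the helper Encode are hypothetical names chosen for this example, and the lengths are arbitrary.

```csharp
using System;
using System.IO;
using System.Text;

class PrefixGrowthDemo
{
    // Returns the bytes BinaryWriter emits for an ASCII string of the given length.
    static byte[] Encode(int length)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(new string('x', length));
            bw.Flush();
            return ms.ToArray();
        }
    }

    static void Main()
    {
        foreach (int length in new[] { 3, 127, 128, 300 })
        {
            byte[] bytes = Encode(length);
            // With ASCII, one character is one byte, so the rest is the prefix.
            int prefixSize = bytes.Length - length;
            Console.WriteLine(
                $"{length} chars -> {bytes.Length} bytes, prefix: {BitConverter.ToString(bytes, 0, prefixSize)}");
        }
    }
}
```

With ASCII the prefix should be one byte up to 127 characters (03 and 7F here) and two bytes from 128 onward (80-01 for 128, AC-02 for 300), matching the loop in Write7BitEncodedInt.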