Performance of FileStream's Write vs WriteByte on an IEnumerable<byte>

I need to write the bytes of an IEnumerable<byte> to a file.
I can convert it to an array and use the Write(byte[]) method:

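var buffer = bytes.ToArray(); // materializes the whole sequence in memory
using (var stream = File.Create(path))
    stream.Write(buffer, 0, buffer.Length); // Stream.Write(buffer, offset, count)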

But since IEnumerable<byte> doesn't expose the item count, ToArray has to grow its buffer while enumerating and ends up holding the whole sequence in memory, so using it is not recommended unless it's absolutely necessary.

So I can just iterate the IEnumerable and use WriteByte(byte) in each iteration:

using (var stream = File.Create(path))
    foreach (var b in bytes)
        stream.WriteByte(b);

I wonder which one will be faster when writing lots of data.

I guess using Write(byte[]) sets the buffer according to the array size so it would be faster when it comes to arrays.

My question is: when I only have an IEnumerable<byte> that holds megabytes of data, which approach is better? Converting it to an array and calling Write(byte[]), or iterating it and calling WriteByte(byte) for each byte?

Şafak Gür asked Dec 31 '25 09:12



1 Answer

Enumerating a large sequence of bytes one element at a time adds a lot of overhead (at least one interface call per byte) to something that is normally cheap: copying bytes from one buffer to the next.

Normally, LINQ-style overhead does not matter much, but when you are processing 100 million bytes per second on a normal hard drive, you will notice severe overhead. This is not premature optimization: we can foresee that this will be a performance hotspot, so we should optimize it eagerly.

So when copying bytes around, you probably should not rely on abstractions like IEnumerable and IList at all. Pass around arrays or ArraySegment<byte> values, which also carry an Offset and a Count. This saves you from having to slice (and therefore copy) arrays too often.
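As a minimal sketch of the ArraySegment<byte> idea: the helper below writes only the slice a segment describes, without copying it into a new array first. SegmentWriter is a hypothetical name used for illustration, not an existing API, and the sketch assumes the segment wraps a real array (a default ArraySegment<byte> has a null Array).

```csharp
using System;
using System.IO;

static class SegmentWriter
{
    // Hypothetical helper: writes exactly the bytes the segment points at.
    // No intermediate array is allocated; Offset and Count select the slice.
    public static void Write(Stream stream, ArraySegment<byte> segment)
    {
        if (segment.Array == null)
            throw new ArgumentException("Segment has no backing array.", "segment");

        stream.Write(segment.Array, segment.Offset, segment.Count);
    }
}
```

For example, SegmentWriter.Write(stream, new ArraySegment<byte>(data, 1024, 4096)) would write 4096 bytes starting at index 1024 of data.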

Another cardinal sin in high-throughput IO is calling a method per byte, such as reading bytewise and writing bytewise. This kills performance because these methods then have to be called hundreds of millions of times per second. I have experienced that myself.
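If you want to see the per-call cost on your own machine, a rough timing sketch could look like the following. WriteBenchmark and Measure are made-up names, the 64 MB size is arbitrary, and a single Stopwatch run like this is only indicative (disk caching and JIT warm-up will skew it). The data is already an array here, so the comparison isolates the cost of calling WriteByte per byte rather than the cost of enumeration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class WriteBenchmark
{
    // Times one way of writing `data` to a fresh temporary file.
    public static TimeSpan Measure(byte[] data, Action<Stream, byte[]> write)
    {
        string path = Path.GetTempFileName();
        try
        {
            var watch = Stopwatch.StartNew();
            using (var stream = File.Create(path))
                write(stream, data);
            watch.Stop();
            return watch.Elapsed;
        }
        finally
        {
            File.Delete(path); // clean up the temporary file
        }
    }

    static void Main()
    {
        var data = new byte[64 * 1024 * 1024]; // 64 MB of zeros, arbitrary test size

        // One WriteByte call per byte.
        var perByte = Measure(data, (s, d) => { foreach (var b in d) s.WriteByte(b); });
        // One Write call for the whole buffer.
        var perBuffer = Measure(data, (s, d) => s.Write(d, 0, d.Length));

        Console.WriteLine("WriteByte per byte: " + perByte);
        Console.WriteLine("Single Write call:  " + perBuffer);
    }
}
```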

Always process entire buffers of at least 4096 bytes at a time. Depending on the medium you are doing IO with, you can use much larger buffers (64 KB, 256 KB or even megabytes).
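If you are stuck with an IEnumerable<byte> anyway, one possible middle ground is to fill a fixed-size buffer while enumerating and hand each full buffer to Write. This is only a sketch: ChunkedWriter, WriteAll and the 64 KB default are my own choices for illustration. It still pays the per-byte enumeration cost, but it avoids both ToArray on the whole sequence and a stream method call per byte.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkedWriter
{
    // Hypothetical helper: copies the sequence into a reusable buffer and
    // issues one Write call per full buffer, plus one for the final remainder.
    public static void WriteAll(Stream stream, IEnumerable<byte> bytes, int bufferSize = 64 * 1024)
    {
        if (bufferSize <= 0)
            throw new ArgumentOutOfRangeException("bufferSize");

        var buffer = new byte[bufferSize];
        int count = 0; // number of valid bytes currently in the buffer

        foreach (var b in bytes)
        {
            buffer[count++] = b;
            if (count == buffer.Length)
            {
                stream.Write(buffer, 0, count); // buffer is full: flush it
                count = 0;
            }
        }

        if (count > 0)
            stream.Write(buffer, 0, count); // write the trailing partial buffer
    }
}
```

With the question's variables, usage would be along the lines of: using (var stream = File.Create(path)) ChunkedWriter.WriteAll(stream, bytes);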

usr answered Jan 02 '26 00:01
