 

Computing Hash while saving a file?

Tags: c#, hash

I have an inputStream that I want to use both to compute a hash and to save the file to disk. I would like to know how to do that efficiently. Should I use a task to do the two concurrently, duplicate the stream and pass one copy to the saveFile method and the other to the computeHash method, or do something else?

Dave asked Jun 20 '12 17:06


3 Answers

What about using a hash algorithm that operates at the block level? For each block in the stream, you can add the block to the hash (using TransformBlock) and then write the same block to the file.

Untested rough shot:

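using System.IO;
using System.Security.Cryptography;

...

public byte[] HashedFileWrite(string filename, Stream input)
{
    using (var hashAlgorithm = MD5.Create())
    // File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
    using (var file = File.Create(filename))
    {
        byte[] buffer = new byte[4096];
        int read;

        // Feed each block to the hash, then write the same block to disk.
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            hashAlgorithm.TransformBlock(buffer, 0, read, null, 0);
            file.Write(buffer, 0, read);
        }

        // Finalize with an empty block; Hash is only valid after this call.
        hashAlgorithm.TransformFinalBlock(buffer, 0, 0);

        // Read Hash before the algorithm is disposed.
        return hashAlgorithm.Hash;
    }
}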

Matt Murrell answered Oct 10 '22 08:10


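This method copies and hashes in one pass by chaining streams: the source is copied into a CryptoStream, which updates the hash and forwards the bytes to the target file.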

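using System.IO;
using System.Security.Cryptography;

private static byte[] CopyAndHash(string source, string target)
{
    using (var sha512 = SHA512.Create())
    {
        using (var targetStream = File.OpenWrite(target))
        using (var cryptoStream = new CryptoStream(targetStream, sha512, CryptoStreamMode.Write))
        using (var sourceStream = File.OpenRead(source))
        {
            // Copy into the CryptoStream, not the target directly,
            // so every byte is hashed on its way to the file.
            sourceStream.CopyTo(cryptoStream);
        }

        // Disposing the CryptoStream flushed the final block, so Hash is now valid.
        return sha512.Hash;
    }
}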

For a full sample, including cancellation and progress reporting, see https://gist.github.com/dhcgn/da1637277d9456db9523a96a0a34da78

hdev answered Oct 10 '22 07:10


It might not be the best option, but I would write a Stream descendant that wraps the stream actually writing the file to disk and acts as a pass-through for it.

So:

  • derive from Stream
  • have one member such as Stream _inner; that will be the target stream to write to
  • implement Write() and the other abstract members of Stream
  • in Write(), hash each block of data and then call _inner.Write()
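
The steps above could be sketched roughly as follows. This is only an illustration, not the answerer's actual class: the choice of SHA-256, the byte[] return type of GetComputedHash(), the write-only behaviour, and the decision to dispose the inner stream together with the wrapper are all assumptions made for the sketch.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical write-only pass-through stream: hashes each block, then forwards it.
public sealed class HashWrapStream : Stream
{
    private readonly Stream _inner;        // target stream that actually writes the file
    private readonly HashAlgorithm _hash;  // SHA-256 is an arbitrary choice for this sketch

    public HashWrapStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        _inner = inner;
        _hash = SHA256.Create();
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _hash.TransformBlock(buffer, offset, count, null, 0); // hash the block first
        _inner.Write(buffer, offset, count);                  // then pass it through
    }

    public override void Flush() { _inner.Flush(); }

    // Finalizes the digest; call once, after the last Write() and before Dispose().
    public byte[] GetComputedHash()
    {
        _hash.TransformFinalBlock(new byte[0], 0, 0);
        return _hash.Hash;
    }

    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _hash.Dispose();
            _inner.Dispose(); // assumption: the wrapper owns the inner stream
        }
        base.Dispose(disposing);
    }
}
```

Because the wrapper is itself a Stream, it can also be handed to anything that writes to a stream, for example someInput.CopyTo(hasher), instead of a manual read loop.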

Usage example

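Stream s = File.OpenRead("infile.dat");
Stream output = File.Create("outfile.dat"); // "out" is a reserved word in C#
HashWrapStream hasher = new HashWrapStream(output);
byte[] buffer = new byte[1024];
int read;
while ((read = s.Read(buffer, 0, buffer.Length)) != 0)
{
    hasher.Write(buffer, 0, read); // write only the bytes actually read
}
// get actual hash (assuming GetComputedHash returns the digest as a byte array)
byte[] hash = hasher.GetComputedHash();
hasher.Dispose();
s.Dispose();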

Daniel Mošmondor answered Oct 10 '22 07:10