
How can I read/stream a file without loading the entire file into memory?

How can I read an arbitrary file and process it "piece by piece" (meaning byte by byte or some other chunk size that would give the best read performance) without loading the entire file into memory? An example of processing would be to generate an MD5 hash of the file although the answer could apply to any operation.

I'd like to have or write this but if I can get existing code that would be great too.

(C#)

Howiecamp asked Jul 28 '11 21:07



4 Answers

Here's an example of how to read a file in chunks of 1KB without loading the entire contents into memory:

const int chunkSize = 1024; // read the file by chunks of 1KB
using (var file = File.OpenRead("foo.dat"))
{
    int bytesRead;
    var buffer = new byte[chunkSize];
    while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
    {
        // TODO: Process only the first bytesRead bytes of the buffer.
        // The buffer is always 1KB, but a read near the end of the file
        // may fill less of it; bytesRead holds the actual count.
    }
}
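
One way to fill in the TODO, sketched for the MD5 case from the question: feed each chunk to an incremental hasher. This assumes a runtime where System.Security.Cryptography.IncrementalHash is available; the class name ChunkedHash and the method ComputeMd5 are made up for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChunkedHash
{
    // Hypothetical helper: hashes a file chunk by chunk with MD5.
    public static byte[] ComputeMd5(string path, int chunkSize = 1024)
    {
        using (var file = File.OpenRead(path))
        using (var hasher = IncrementalHash.CreateHash(HashAlgorithmName.MD5))
        {
            var buffer = new byte[chunkSize];
            int bytesRead;
            while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Feed only the bytes actually read, not the whole buffer.
                hasher.AppendData(buffer, 0, bytesRead);
            }
            // Finalize and return the 16-byte digest.
            return hasher.GetHashAndReset();
        }
    }
}
```

Only one chunk-sized buffer is allocated, whatever the file size, so memory use stays constant.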
Darin Dimitrov answered Nov 16 '22 13:11



System.IO.FileStream does not load the file into memory; it reads from disk on demand.
The stream is seekable, and the MD5 hashing algorithm processes its input incrementally, so it does not have to load the stream (file) into memory either.

Please replace file_path with the path to your file.

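byte[] hash = null;

// FileAccess.Read opens the file read-only; without it, this constructor
// requests read/write access and fails on read-only files.
using (var stream = new FileStream(file_path, FileMode.Open, FileAccess.Read))
{
    // MD5.Create() is used because MD5CryptoServiceProvider is marked obsolete in recent .NET versions.
    using (var md5 = System.Security.Cryptography.MD5.Create())
    {
        // ComputeHash(Stream) reads the stream block by block, not all at once.
        hash = md5.ComputeHash(stream);
    }
}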

After this runs, the MD5 hash is stored in the hash variable.
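
The hash is a raw 16-byte array. To display or compare it, it is usually rendered as a hex string. A minimal sketch follows; the class name HashFormat and the method ToHex are made up for illustration.

```csharp
using System;

public static class HashFormat
{
    // Hypothetical helper: renders hash bytes as a lowercase hex string.
    public static string ToHex(byte[] hash)
    {
        // BitConverter.ToString yields dash-separated uppercase pairs ("5D-41-...");
        // strip the dashes and lowercase the result.
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }
}
```

Usage would be something like string text = HashFormat.ToHex(hash);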

Vercas answered Nov 16 '22 15:11



    long fullfilesize = 0;               // full size of the file in bytes
    int DefaultReadValue = 10485760;     // read 10 MB at a time
    int toRead = 10485760;               // size of the next chunk
    long position = 0;                   // where the next chunk starts

    // Each click reads the next chunk of up to 10 MB.
    private void Button_Click(object sender, RoutedEventArgs e)
    {
        using (var fs = new FileStream(@"filepath", FileMode.Open, FileAccess.Read))
        {
            fullfilesize = fs.Length;
            fs.Position = position;

            if (fs.Position >= fullfilesize)
            {
                MessageBox.Show(" all done");
                return;
            }
            System.Diagnostics.Debug.WriteLine("file position" + fs.Position);

            if (fullfilesize - position < toRead)
            {
                toRead = (int)(fullfilesize - position);
                MessageBox.Show("last time");
            }
            System.Diagnostics.Debug.WriteLine("toread" + toRead);
            int bytesRead;
            byte[] buffer = new byte[toRead];
            int offset = 0;
            position += toRead;
            // Read may return fewer bytes than requested, so loop until the chunk is full.
            while (toRead > 0 && (bytesRead = fs.Read(buffer, offset, toRead)) > 0)
            {
                toRead -= bytesRead;
                offset += bytesRead;
            }

            // TODO: process the first offset bytes of buffer here

            toRead = DefaultReadValue;
        }
    }

Building on Darin's answer, this handler reads the file in 10 MB chunks, one chunk per click, until the end of the file.
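
If no button is involved, the same 10 MB chunking could be written as a plain loop. The sketch below wraps it in an iterator; the class name ChunkReader and the method ReadChunks are made up for illustration, and the 10 MB default is simply the figure from this answer, not a measured optimum.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkReader
{
    // Hypothetical helper: yields the file as consecutive chunks of up to chunkSize bytes.
    // The same buffer is reused, so consume each segment before requesting the next.
    public static IEnumerable<ArraySegment<byte>> ReadChunks(string path, int chunkSize = 10 * 1024 * 1024)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[chunkSize];
            while (true)
            {
                // Read may return fewer bytes than requested, so fill the buffer in a loop.
                int filled = 0;
                int bytesRead;
                while (filled < chunkSize &&
                       (bytesRead = fs.Read(buffer, filled, chunkSize - filled)) > 0)
                {
                    filled += bytesRead;
                }
                if (filled == 0)
                {
                    yield break; // end of file
                }
                yield return new ArraySegment<byte>(buffer, 0, filled);
            }
        }
    }
}
```

A caller would iterate with foreach (var chunk in ChunkReader.ReadChunks(path)) and use chunk.Array, chunk.Offset and chunk.Count; only the final chunk can be shorter than chunkSize.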

Sanath Shetty answered Nov 16 '22 13:11



const int MAX_BUFFER = 1024;
byte[] Buffer = new byte[MAX_BUFFER];
int BytesRead;
using (System.IO.FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    while ((BytesRead = fileStream.Read(Buffer, 0, MAX_BUFFER)) != 0)
    {
        // Process this chunk: the valid data starts at offset 0
        // of Buffer and runs for BytesRead bytes.
    }
CSharper answered Nov 16 '22 13:11
