
What is the difference between file reading and streaming?

I read in a book that in Node.js streaming is better than reading a whole file at once. I understand the idea, but isn't file reading itself done with streams? I'm used to this from Java and C++: when I want to read a file, I use streams. So what's the difference here? Also, what is the difference between fs.createReadStream(<somefile>) and fs.readFile(<somefile>)? Both are asynchronous, right?

OMar Mohamed asked Jul 19 '16 12:07


2 Answers

The first thing to know is that reading a file with readFile() is a fully buffered method, while streaming is a partially buffered method.

Now what does it mean?

Fully buffered function calls like readFileSync() and readFile() expose the data as one big blob. That is, the read is performed and then the full set of data is returned, either synchronously or asynchronously. With these fully buffered methods, we have to wait until all of the data has been read, and internally Node needs to allocate enough memory to hold all of it. This can be problematic: imagine an application that reads a 1 GB file from disk. With only fully buffered access we would need about 1 GB of memory to hold the whole content of the file, since both readFile and readFileSync hand back a single Buffer (or a string, if an encoding is specified) containing all of the data.

Partially buffered access methods are different. They do not treat data input as a discrete event, but rather as a series of events which occur as the data is being read or written. They allow us to access data as it is being read from disk/network/other I/O.

Streams return smaller parts of the data (using a Buffer), and trigger a callback when new data is available for processing.

Streams are EventEmitters. If our 1 GB file needed, for example, to be processed only once, we could use a stream and process the data as soon as it is read. This is useful because we never have to hold all of the data in memory in one buffer: once a chunk has been processed, it no longer needs to be kept around.
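
To make the contrast concrete, here is a minimal sketch (an illustration, not part of the original answer) that computes a SHA-256 digest of the same file both ways. The scratch file name and the helper names hashBuffered and hashStreamed are made up for this example, and a 1 MiB file stands in for the 1 GB one.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file standing in for the 1 GB file (1 MiB here).
const file = path.join(os.tmpdir(), 'stream-hash-demo.bin');
fs.writeFileSync(file, Buffer.alloc(1024 * 1024, 'a'));

// Fully buffered: the callback fires once, with the whole file in memory.
function hashBuffered(filePath) {
    return new Promise(function (resolve, reject) {
        fs.readFile(filePath, function (err, data) {
            if (err) return reject(err);
            resolve(crypto.createHash('sha256').update(data).digest('hex'));
        });
    });
}

// Partially buffered: 'data' fires once per chunk; each chunk is fed to the
// hash and is not kept around afterwards.
function hashStreamed(filePath) {
    return new Promise(function (resolve, reject) {
        const hash = crypto.createHash('sha256');
        fs.createReadStream(filePath)
            .on('data', function (chunk) { hash.update(chunk); })
            .on('error', reject)
            .on('end', function () { resolve(hash.digest('hex')); });
    });
}

const digests = Promise.all([hashBuffered(file), hashStreamed(file)]);
digests.then(function (results) {
    console.log('buffered:', results[0]);
    console.log('streamed:', results[1]);
});
```

Both digests should match; the difference is only in how much of the file has to sit in memory while the hash is being computed.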

The Node stream interface consists of two parts: Readable streams and Writable streams. Some streams are both readable and writable (Duplex and Transform streams).
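
As a rough illustration of these stream types working together, the sketch below wires a Readable source through a Transform (which is both writable and readable) into a Writable sink. The in-memory source and the names upperCase, sink and received are assumptions made for the example; in practice the source would usually be something like fs.createReadStream().

```javascript
const { Readable, Writable, Transform, pipeline } = require('stream');

// Readable: a source of chunks (here an in-memory array instead of a file).
const source = Readable.from(['streams ', 'are ', 'event ', 'emitters']);

// Transform: accepts chunks on its writable side and emits changed chunks
// on its readable side.
const upperCase = new Transform({
    transform(chunk, encoding, callback) {
        callback(null, chunk.toString().toUpperCase());
    }
});

// Writable: a destination that consumes chunks one at a time.
const received = [];
const sink = new Writable({
    write(chunk, encoding, callback) {
        received.push(chunk.toString());
        callback();
    }
});

// pipeline() connects the streams and reports completion or the first error.
const done = new Promise(function (resolve, reject) {
    pipeline(source, upperCase, sink, function (err) {
        if (err) return reject(err);
        resolve(received.join(''));
    });
});

done.then(function (text) {
    console.log(text);
});
```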

Devendra Verma answered Oct 04 '22 22:10


So what's the difference here ?? also what is the difference between fs.createReadStream(<somefile>); and fs.readFile(<somefile>); both are asynchronous, right !!

Well, aside from the fact that fs.createReadStream() directly returns a stream object, while fs.readFile() expects a callback function as its last argument, there is another huge difference.

Yes they are both asynchronous, but that doesn't change the fact that fs.readFile() doesn't give you any data until the entire file has been buffered into memory. This is much less memory-efficient and slower when relaying data back through server responses. With fs.createReadStream(), you can pipe() the stream object directly to a server's response object, which means your client can immediately start receiving data even if the file is 500MB.

On top of that, you improve memory efficiency by dealing with the file one chunk at a time. Your process only has to buffer a small piece of the file contents at any moment (64 KiB per chunk by default for fs read streams) rather than the whole file.

Here are two snippets demonstrating what I'm saying:

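const fs = require('fs');
const http = require('http');

// using readFile()
http.createServer(function (req, res) {
    // let's pretend this is a huge 500MB zip file
    fs.readFile('some/file/path.zip', function (err, data) {
        if (err) {
            // e.g. the file does not exist or cannot be read
            res.statusCode = 500;
            return res.end();
        }
        // entire file must be buffered in memory to data, which could be very slow
        // entire buffer is sent at once, no streaming here
        res.end(data);
    });
}).listen(8080); // port chosen arbitrarily for the example

// using createReadStream()
http.createServer(function (req, res) {
    // processes the large file in chunks,
    // sending them to the client as soon as they're ready
    fs.createReadStream('some/file/path.zip')
        .on('error', function () {
            // without a handler, a failed read is thrown as an unhandled 'error' event
            res.destroy();
        })
        .pipe(res);
    // this is more memory-efficient and responsive
}).listen(8081); // port chosen arbitrarily for the example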
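
The chunking described above can also be observed directly. The following is only a sketch under stated assumptions: the scratch file and the helper chunkSizes are hypothetical, and highWaterMark is the fs.createReadStream() option that caps how many bytes each chunk may hold.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file: 256 KiB of zero bytes.
const file = path.join(os.tmpdir(), 'chunk-size-demo.bin');
fs.writeFileSync(file, Buffer.alloc(256 * 1024));

// Resolves with the length of every chunk the stream emits.
function chunkSizes(filePath, options) {
    return new Promise(function (resolve, reject) {
        const sizes = [];
        fs.createReadStream(filePath, options)
            .on('data', function (chunk) { sizes.push(chunk.length); })
            .on('error', reject)
            .on('end', function () { resolve(sizes); });
    });
}

const sizes = Promise.all([
    chunkSizes(file),                               // default highWaterMark
    chunkSizes(file, { highWaterMark: 16 * 1024 })  // at most 16 KiB per chunk
]);
sizes.then(function (results) {
    console.log('default:', results[0].length, 'chunks, largest', Math.max(...results[0]));
    console.log('16 KiB: ', results[1].length, 'chunks, largest', Math.max(...results[1]));
});
```

A smaller highWaterMark lowers peak memory per stream at the cost of more read calls, so the default is usually a reasonable starting point.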

Patrick Roberts answered Oct 04 '22 21:10