
"Error: OK" when using fs.readFile() in Node.js (after some iteration of about a hundred thousand)?

Tags:

node.js

I'm "walking" a hundred thousand JSON files, reading the content and throwing an error if something bad happens:

walk(__dirname + '/lastfm_test', 'json', function (err, files) {
    files.forEach(function (filePath) {
        fs.readFile(filePath, function (err, data) {
            if (err) throw err;
        });
    });
});

The walk function is largely inspired by this question (chjj's answer). After some number of iterations, the line if (err) throw err gets executed. The error thrown is:

Error: OK, open 'path/to/somejsonfile.json'

Any idea what's happening here? I'm sure the walk function is OK: in fact, replacing the fs.readFile() call with console.log(filePath) prints all the paths without errors.

Some useful info: Windows 7 x64, node.exe x64 0.10.5. Last.fm dataset downloaded from here.

gremo asked Feb 17 '23 14:02

2 Answers

I recommend using the graceful-fs module for this purpose. It will automatically limit the number of open file descriptors. It's written by Isaac Schlueter, the creator of npm and maintainer of Node, so it's pretty solid. The bare fs module lets you shoot yourself in the foot.

Myrne Stol answered Mar 05 '23 17:03


The forEach loop calls readFile a huge number of times. Node.js starts opening the files in background threads, but no file is actually processed on the main thread until the forEach loop has finished (and every open request has been scheduled). As a result, no files are processed (and later closed) while all the opens are being queued. At some point too many files are open at once, all available handles are in use, and you get the unhelpful error message.

There are multiple solutions to your problem:

First, you could open all the files synchronously, one after another. This slows the application down and doesn't match the event-based programming model of Node.js, but it is the easiest fix if you don't mind the performance.

Better would be to open only a limited number of files at a time (e.g. ~1000) and, after processing one, open the next.

Pseudocode:

1. Walk the file system and store all file names in an array.
2. Call fs.readFile for a batch of files from the array.
3. In the readFile callback, after processing, start opening more files from the array if it is not empty.
Fox32 answered Mar 05 '23 17:03