I am loading > 220K rows into a sqlite3 db. Each row is stored in a separate file, hence > 220K files.
const fs = require('fs');
const path = require('path');

fs.readdir(dir, {}, (err, files) => {
    files.forEach(file => {
        fs.readFile(path.join(dir, file), 'utf8', (err, data) => {
            //.. process file and insert into db ..
        });
    });
});
The above fails with Error: EMFILE: too many open files.
From what I understand, I shouldn't have to close the files myself, because fs.readFile
opens the file, reads it, and then closes it for me. I am on Mac OS X, and my open-file ulimit is set to 8192:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited
What can I do to get past this error?
Solution
You can solve this by queuing up the readFile operations as soon as an EMFILE error occurs, and only executing further reads after a file has been closed. Luckily, this is exactly what graceful-fs does, so simply replacing the fs module with graceful-fs will fix your issue:
const fs = require('graceful-fs');
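As a minimal sketch, your original loop stays exactly the same apart from that one require line (assuming the same dir variable and insert logic as in your question); graceful-fs exposes the same readdir/readFile API and transparently queues calls that would otherwise hit EMFILE:

const fs = require('graceful-fs');
const path = require('path');

fs.readdir(dir, (err, files) => {
    if (err) throw err;
    files.forEach(file => {
        // graceful-fs queues this call internally when the process runs out
        // of file descriptors and retries it once other files have closed
        fs.readFile(path.join(dir, file), 'utf8', (err, data) => {
            if (err) throw err;
            //.. process file and insert into db ..
        });
    });
});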
Problem
Due to the async nature of Node, each iteration of your loop starts reading a file and then immediately continues with the next iteration, so all 220K+ reads are scheduled at once. Every read opens a file descriptor that is not closed until that read succeeds or fails, so your process ends up trying to hold more files open than the limit allows (8192) and the error is produced.
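For illustration, here is roughly what that queuing amounts to if done by hand: cap the number of files being read at once and start the next read only when one finishes. This is a sketch of the mechanism, not code from graceful-fs; the CONCURRENCY value is an arbitrary assumption chosen to stay well below the 8192 limit.

const fs = require('fs');
const path = require('path');

const CONCURRENCY = 1000; // assumed cap, well under the 8192 open-file limit

fs.readdir(dir, (err, files) => {
    if (err) throw err;
    let index = 0;   // next file to read
    let active = 0;  // reads currently in flight

    function next() {
        // start reads until the cap is reached or the list is exhausted
        while (active < CONCURRENCY && index < files.length) {
            const file = files[index++];
            active++;
            fs.readFile(path.join(dir, file), 'utf8', (err, data) => {
                active--;
                if (err) throw err;
                //.. process file and insert into db ..
                next(); // a descriptor is free again, schedule another read
            });
        }
    }

    next();
});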