My file reader API code has been working fine so far, until one day I got a 280 MB txt file from one of my clients. The page just crashes outright in Chrome, and in Firefox nothing happens at all.
// create a new reader object
var fileReader = new FileReader();
fileReader.onload = function(e) {
    // the whole file has been read at this point;
    // do sanity checks here etc...
    $timeout(function() {
        var fileContent = e.target.result;
        // get the first line
        var firstLine = fileContent.slice(0, fileContent.indexOf("\n"));
    });
};
// read the file as text
fileReader.readAsText($files[i]);
What I am trying to do above is find the first line break so that I can get the column length of the file. Should I not read it as text? How can I get the column length of the file without breaking the page on big files?
FileReader can only access the contents of files that the user has explicitly selected, either using an HTML <input type="file"> element or by drag and drop. It cannot be used to read a file by pathname from the user's file system. To read files on the client's file system by pathname, use the File System Access API.
The FileReader methods work asynchronously but don't return a Promise. Attempting to retrieve the result immediately after calling a method will not work: the .onload event handler fires only after the FileReader has successfully finished reading the file and updated its .result property.
The FileReader result property returns the file's contents. This property is only valid after the read operation is complete, and the format of the data depends on which of the methods was used to initiate the read operation.
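For example, here is a minimal sketch of the correct order of operations, assuming a plain <input type="file"> element on the page (which is not part of the question's Angular setup): the handler is attached before the read starts, and .result is only touched inside it.

document.querySelector('input[type="file"]').addEventListener("change", function(event) {
    var file = event.target.files[0];
    var reader = new FileReader();
    // attach the handler before starting the read
    reader.onload = function() {
        // .result is only populated once the load event has fired
        console.log(reader.result.length + " characters read");
    };
    reader.readAsText(file);
});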
Your application is failing for big files because you're reading the full file into memory before processing it. This inefficiency can be solved by streaming the file (reading chunks of a small size), so you only need to hold a part of the file in memory.
A File object is also an instance of a Blob, which offers the .slice method to create a smaller view of the file.
Here is an example that assumes that the input is ASCII (demo: http://jsfiddle.net/mw99v8d4/).
function findColumnLength(file, callback) {
    // Read 1 KB at a time, because we expect the first line to be small.
    var CHUNK_SIZE = 1024;
    var offset = 0;
    var fr = new FileReader();
    fr.onload = function() {
        var view = new Uint8Array(fr.result);
        for (var i = 0; i < view.length; ++i) {
            if (view[i] === 10 || view[i] === 13) { // \n = 10 and \r = 13
                // column length = offset + position of \r or \n
                callback(offset + i);
                return;
            }
        }
        // \r or \n not found, continue seeking.
        offset += CHUNK_SIZE;
        seek();
    };
    fr.onerror = function() {
        // Cannot read file... Do something, e.g. assume column size = 0.
        callback(0);
    };
    seek();

    function seek() {
        if (offset >= file.size) {
            // No \r or \n found. The column size is equal to the full file size.
            callback(file.size);
            return;
        }
        var slice = file.slice(offset, offset + CHUNK_SIZE);
        fr.readAsArrayBuffer(slice);
    }
}
The previous snippet counts the number of bytes before a line break. Counting the number of characters in a text consisting of multibyte characters is slightly more difficult, because you have to account for the possibility that the last byte in the chunk could be a part of a multibyte character.
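If the file may contain multibyte text such as UTF-8, one option (a sketch, not part of the original answer) is to decode each chunk with a TextDecoder in streaming mode: with stream: true the decoder buffers an incomplete multibyte sequence until the next chunk arrives. This variant returns the first line's text instead of a byte count.

function findFirstLine(file, callback) {
    var CHUNK_SIZE = 1024;
    var offset = 0;
    var decoder = new TextDecoder("utf-8");
    var text = "";
    var fr = new FileReader();
    fr.onload = function() {
        // stream: true keeps an incomplete multibyte sequence buffered
        // until the next chunk is decoded.
        text += decoder.decode(fr.result, { stream: true });
        var pos = text.search(/\r|\n/);
        if (pos !== -1) {
            callback(text.slice(0, pos));
            return;
        }
        offset += CHUNK_SIZE;
        if (offset >= file.size) {
            // no line break found; the whole file is one line
            callback(text);
            return;
        }
        fr.readAsArrayBuffer(file.slice(offset, offset + CHUNK_SIZE));
    };
    fr.onerror = function() {
        callback(null);
    };
    fr.readAsArrayBuffer(file.slice(0, CHUNK_SIZE));
}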
There is an awesome library called Papa Parse that does this gracefully! It can really handle big files, and you can also use a web worker.
Just try out the demos that they provide: https://www.papaparse.com/demo
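For instance, a rough sketch of getting the column count of just the first row (preview: 1 stops parsing after one row; double-check the option names against the current Papa Parse docs):

Papa.parse(file, {
    preview: 1,   // parse only the first row
    worker: true, // keep the heavy lifting off the UI thread
    step: function(results) {
        // in recent Papa Parse versions, results.data is the row's field array
        console.log("column count: " + results.data.length);
    }
});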