Progressively read binary file in JavaScript

Using Chrome, I am trying to read and process a large (>4GB) binary file on my local disk. It looks like the FileReader API will only read the entire file, but I need to be able to read the file progressively as a stream.

This file contains a sequence of frames containing a 1-byte type identifier, a 2-byte frame length, an 8-byte time stamp, and then some binary data that has a format based on the type. The content of these frames will be accumulated, and I'd like to use HTML5+JavaScript to generate graphs and display other metrics as real-time playback based on the content of this file.

Anybody have any ideas?

Krum asked Mar 23 '23 11:03

1 Answer

Actually, Files are Blobs, and Blob has a slice method, which we can use to grab smaller chunks of large files.

I wrote the following snippet last week to filter large log files, but it shows the pattern you can use to loop through a big file one sub-section at a time.

  1. file is the file object
  2. fnLineFilter is a function that accepts one line of the file and returns true to keep it
  3. fnComplete is a callback where the collected lines are passed as an array

Here is the code I used:

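 function fileFilter(file, fnLineFilter, fnComplete) {
     var mx = file.size,
         BUFF_SIZE = 262144, // chunk size in bytes (256 KiB)
         i = 0,              // index of the next chunk to read
         collection = [],    // lines accepted by fnLineFilter
         lineCount = 0;
     var d1 = +new Date;
     var remainder = "";     // incomplete last line carried into the next chunk

     function grabNextChunk() {

         // slice() clamps the end offset to the file size
         var myBlob = file.slice(BUFF_SIZE * i, (BUFF_SIZE * i) + BUFF_SIZE, file.type);
         i++;

         var fr = new FileReader();

         fr.onload = function(e) {

             //run line filter:
             var str = remainder + e.target.result,
                 r = str.split(/\r?\n/),
                 done = (BUFF_SIZE * i) >= mx;
             remainder = r.pop(); // last piece may be an incomplete line
             if (done && remainder) {
                 r.push(remainder); // final line with no trailing newline
             }
             lineCount += r.length;

             // keep the lines for which fnLineFilter returns true
             var rez = r.filter(fnLineFilter);
             if (rez.length) {
                 [].push.apply(collection, rez);
             } /* end if */

             if (done) {
                 fnComplete(collection);
                 console.log("filtered " + file.name + " in " + (+new Date() - d1) + "ms  ");
             } /* end if(done) */
             else {
                 setTimeout(grabNextChunk, 0);
             }

         };
         // the second argument of readAsText is an encoding label, not a MIME type;
         // it is omitted here, so the text is decoded as UTF-8. Note that a multi-byte
         // character split across a chunk boundary will not decode cleanly.
         fr.readAsText(myBlob);
     } /* end grabNextChunk() */

     grabNextChunk();
 } /* end fileFilter() */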

Obviously, you can get rid of the line finding and just grab raw byte ranges instead (for binary data, read each slice with readAsArrayBuffer rather than readAsText). I wasn't sure what type of data you need to dig through, and the important thing is the slice mechanics, which the text-focused code above demonstrates.
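For the frame format in the question, the same slice loop could look roughly like the sketch below. Treat it as an illustration under assumptions, not a tested solution for your file: the question gives only the field sizes, so I am assuming little-endian integers and that the 2-byte length counts the payload bytes only (not the 11-byte header). The names parseFrames, readFrames and onFrame are made up for this example. It also uses Blob.arrayBuffer(), a promise-based alternative to FileReader.readAsArrayBuffer available in current browsers.

```javascript
// Assumed frame layout (the question gives field sizes only):
//   byte 0      type identifier (uint8)
//   bytes 1-2   payload length in bytes (uint16, little-endian: assumption)
//   bytes 3-10  time stamp (uint64, little-endian: assumption)
//   bytes 11..  payload, `length` bytes (assumes length excludes the header)
const HEADER_SIZE = 11;

// Parse every complete frame found in `bytes` (a Uint8Array).
// Returns the frames plus the number of bytes consumed, so the caller
// can carry an incomplete trailing frame over to the next chunk.
function parseFrames(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const frames = [];
    let pos = 0;
    while (pos + HEADER_SIZE <= bytes.length) {
        const length = view.getUint16(pos + 1, true);
        const end = pos + HEADER_SIZE + length;
        if (end > bytes.length) break; // payload not fully loaded yet
        frames.push({
            type: view.getUint8(pos),
            timestamp: view.getBigUint64(pos + 3, true), // a BigInt
            payload: bytes.subarray(pos + HEADER_SIZE, end)
        });
        pos = end;
    }
    return { frames, consumed: pos };
}

// Walk a File/Blob chunk by chunk, calling onFrame once per frame.
// Resolves to the number of leftover bytes (0 if the file ended cleanly).
async function readFrames(file, onFrame, chunkSize = 262144) {
    let leftover = new Uint8Array(0);
    for (let offset = 0; offset < file.size; offset += chunkSize) {
        const buf = await file.slice(offset, offset + chunkSize).arrayBuffer();
        const chunk = new Uint8Array(buf);
        // Prepend whatever was left over from the previous chunk.
        const bytes = new Uint8Array(leftover.length + chunk.length);
        bytes.set(leftover, 0);
        bytes.set(chunk, leftover.length);
        const { frames, consumed } = parseFrames(bytes);
        frames.forEach(onFrame);
        leftover = bytes.slice(consumed);
    }
    return leftover.length;
}
```

You would call it with the File from an input element, e.g. readFrames(file, function (frame) { /* update graphs */ }). Only one chunk plus a partial frame is held in memory at a time, so the >4GB size should not matter as long as you don't keep references to every payload.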

dandavis answered Apr 01 '23 04:04
