
Is there a Node module for an async JSON parser that does not load the entire JSON string into memory?

I realize that there are a ton of Node modules that provide an async API for parsing JSON, but many of them seem to read the entire file or stream into memory, construct a giant string, and then pass it to JSON.parse(). This is what the second answer to "How to parse JSON using NodeJS?" suggests, and is exactly what the jsonfile module does.
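
For reference, the pattern I want to avoid looks roughly like this (the path is just a placeholder):

var fs = require('fs');

// Buffer the whole file into one giant string, then hand it to JSON.parse()
// in a single synchronous call. This is the memory profile I want to avoid.
fs.readFile('/path/to/huge.json', 'utf8', function (err, jsonString) {
  if (err) throw err;
  var data = JSON.parse(jsonString); // entire document held in memory as a string
  // ...
});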

Constructing a giant string is exactly what I want to avoid. I want an API like:

parseJsonFile(pathToJsonFile): Promise

where the Promise that is returned resolves to the parsed JSON object. This implementation should use a constant amount of memory. I'm not interested in any sort of SAX-like thing that broadcasts events as various pieces are parsed: just the end result.

I think jsonparse may do what I want (it clearly includes logic for parsing JSON without using JSON.parse()), but there is no simple example in the README.md, and the one file in the examples directory seems overly complicated.
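
For what it's worth, here's a rough sketch of how I imagine wrapping jsonparse to get that API, based on my reading of its onValue/write interface (the onError hook and the stack check are my assumptions, not something I've verified):

var fs = require('fs');
var Parser = require('jsonparse');

function parseJsonFile(pathToJsonFile) {
  return new Promise(function (resolve, reject) {
    var parser = new Parser();

    parser.onValue = function (value) {
      // Assumption: an empty stack means this value is the root,
      // i.e. the fully parsed document.
      if (this.stack.length === 0) {
        resolve(value);
      }
    };

    // Assumption: jsonparse reports malformed input through onError.
    parser.onError = reject;

    fs.createReadStream(pathToJsonFile)
      .on('data', function (chunk) { parser.write(chunk); })
      .on('error', reject);
  });
}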

bolinfest asked Oct 17 '14

1 Answer

I've written a module that does this: BFJ (Big-Friendly JSON). It exports a bunch of functions that operate at different levels of abstraction, but are all asynchronous and streaming at their core.

At the highest level are two functions for reading from and writing to the file system, bfj.read and bfj.write. They each return a promise, so you call them like this:

var bfj = require('bfj');

// Asynchronously read from a JSON file on disk
bfj.read(path)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

// Asynchronously write to a JSON file on disk
bfj.write(path, data)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

Also at this level is a function for serializing data to a JSON string, called bfj.stringify:

// Asynchronously serialize data to a JSON string
bfj.stringify(data)
  .then(json => {
    // :)
  })
  .catch(error => {
    // :(
  });

Beneath those are two more generic functions for reading from and writing to streams, bfj.parse and bfj.streamify. These serve as foundations for the higher level functions, but you can also call them directly:

// Asynchronously parse JSON from a readable stream
bfj.parse(readableStream)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

// Asynchronously serialize data to a writable stream of JSON
bfj.streamify(data)
  .pipe(writableStream);

At the lowest level there are two functions analogous to SAX parsers/serializers, bfj.walk and bfj.eventify. It's unlikely you'd want to call these directly; they're just the guts of the implementation for the higher levels.

It's open-source and MIT-licensed. For more information, check the readme.

Phil Booth answered Nov 02 '22