
Node.js streams for XML transformations with xml-stream

I'm using xml-stream to read a large XML file. I'd like to:

  1. pipe collected elements to a stream
  2. optionally transform those elements using one or more pipes
  3. pipe the result to an HTTP response

Here is the xml-stream snippet that collects the required elements:

xml.on('endElement: item', function(item) {
  // pipe item to stream
})

How do I build the streams for steps 1 and 2?

P.S. The xml-stream documentation has only console.log examples.

UPDATE 1

Here is what I wrote so far:

const fs = require('fs');
const stream = require('stream');
const XmlStream = require('xml-stream');

let liner = new stream.Transform( { objectMode: true } );

liner._transform = function (data, encoding, done) {
  this.push(data);
  console.log(data);
  console.log('======================='); 
  done();
};

let fileStream = fs.createReadStream(fileNames[0]);

let xmlStream = new XmlStream(fileStream);

let counter = 0;

xmlStream.on('endElement: Item', function(el) {
  liner.write(el);
  counter += 1;
});

xmlStream.on('end', function() {
  console.log(counter);
  liner.end();
});

_transform gets called on every write, but piping the liner stream to the HTTP response doesn't produce any output.

asked Jun 18 '15 09:06 by krl



1 Answer

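Mission accomplished. The function below returns a transform stream that can be piped to any writable stream. The notable change from the code in the question is that each element is serialized with JSON.stringify before being written: an HTTP response accepts only strings or Buffers, not plain objects, which is likely why the original produced no output. liner._flush is necessary only if you want to add some data at the end of the stream.

P.S. through2 is a handy module for this kind of thing (not used here): https://github.com/rvagg/through2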

const fs = require('fs');
const stream = require('stream');
const XmlStream = require('xml-stream');

function getTransformStream() { 

  let liner = new stream.Transform( { objectMode: true } );

  liner._transform = function (data, encoding, done) {
    // have your transforms here
    this.push(data);
    console.log(data);
    console.log('=======================');
    done();
  };

  liner._flush = function (done) {
    console.log('DONE DONE DONE DONE');
    done();
  };


  let fileStream = fs.createReadStream('filename');

  let xmlStream = new XmlStream(fileStream);

  let counter = 0;

  xmlStream.on('endElement: Item', function(el) {
    liner.write(JSON.stringify(el));
    counter += 1;
  });

  xmlStream.on('end', function() {
    console.log(counter);
    liner.end();
  });

  return liner;
}
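
For steps 2 and 3 of the question (optional transforms, then the HTTP response), here is a minimal sketch of how the stages might be chained. It uses only the built-in stream module, so the names are assumptions for illustration: `items` stands in for the elements xml-stream would emit, `upperCaseName` is a made-up intermediate transform, and `sink` stands in for the HTTP response. The last transform also shows the `_flush` case mentioned above (here via the `flush` option) by closing a JSON array after the final item.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Hypothetical stand-in for the elements xml-stream would emit.
const items = Readable.from([{ name: 'a' }, { name: 'b' }]);

// Optional middle step: objects in, objects out.
const upperCaseName = new Transform({
  objectMode: true,
  transform(item, encoding, done) {
    done(null, { ...item, name: item.name.toUpperCase() });
  }
});

// Last step before the response: objects in, text out, as one JSON array.
let first = true;
const toJsonArray = new Transform({
  writableObjectMode: true,
  transform(item, encoding, done) {
    const prefix = first ? '[' : ',';
    first = false;
    done(null, prefix + JSON.stringify(item));
  },
  flush(done) {
    // Runs once after the last item: close the array (or emit an empty one).
    this.push(first ? '[]' : ']');
    done();
  }
});

// Stand-in for the HTTP response; in a request handler, `res` would go here.
const chunks = [];
const sink = new Writable({
  write(chunk, encoding, done) {
    chunks.push(chunk.toString());
    done();
  }
});

// pipeline() connects the stages and reports the first error from any of them.
const result = new Promise((resolve, reject) => {
  pipeline(items, upperCaseName, toJsonArray, sink, (err) => {
    if (err) reject(err);
    else resolve(chunks.join(''));
  });
});

result.then((body) => console.log(body)); // prints [{"name":"A"},{"name":"B"}]
```

Only the final stage needs to produce strings; every stage before it can stay in object mode, so additional transforms can be inserted into the pipeline() call without touching the others.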

answered Oct 03 '22 06:10 by krl