Question

I have a strange problem.

I have to spawn an external process from my Node app (it's not a Node script, so fork is not possible). This child process produces output that I need to read back and store in a database. The way I do it now is to echo each line of my data (it's JSON) and listen for whatever arrives on stdout.

Child code:

var cntSent = 0;
for (var j = 0, lUF = uniqueFlyers.length; j < lUF; j++) {
  var products = uniqueFlyers[j].products;
  for (var k = 0, lP = products.length; k < lP; k++) {
    var pstr = products[k].product;
    this.echo(pstr);
    cntSent+=1;
  }
}
console.log(cntSent);

At the end, cntSent = 10000.

Node side:

var cntReceived = 0;
proc.stdout.on('data', function(line) {
  cntReceived+=1;
  console.log(line);
});
proc.on('close', function (code) {
  console.log(cntReceived);
});

At the end, cntReceived = 3510.

I can see all my data in the output, but it arrives aggregated into big chunks rather than line by line. My fallback idea is to write to a file and then process the file with Node, but that seems redundant, and I'd like to start processing the data as it arrives. Any suggestions on the most accurate and fastest way?

EDIT: As usual, writing down the question made me think. Am I just being silly, and would I be better off buffering the data and then parsing it? It's JSON, goddamnit!


Solution

There is no need to write the data to a file and then process the file; nor do you need to buffer all the data before processing it.

If the data you're outputting is JSON, I'd suggest using JSONStream in the parent code. It will let you parse the output on the fly. Below is an example.

The child code will output a JSON array:

// Child code
var cntSent = 0;
console.log('['); // we'll output a JSON array
for (var j = 0, lUF = uniqueFlyers.length; j < lUF; j++) {
  var products = uniqueFlyers[j].products;
  for (var k = 0, lP = products.length; k < lP; k++) {
    var pstr = products[k].product;
    console.log(JSON.stringify(pstr)); // output some JSON
    if (j !== lUF - 1 || k !== lP - 1)
        console.log(','); // output a comma after every element except the last one
    cntSent+=1;
  }
}
console.log(']'); // close the array

The parent code reads this JSON array and processes it. We use the * selector to select all elements of the array; JSONStream then emits each JSON document one by one, as it is parsed. Once we have these objects, we pipe them into a Writable stream that does something (anything!) with them.

// Parent code
var stream = require('stream');
var jsonstream = require('JSONStream').parse('*');
var finalstream = new stream.Writable({ objectMode: true }); // this stream receives objects, not raw buffers or strings
finalstream._write = function (doc, encoding, done) {
    console.log(doc);
    done();
};

proc.stdout.pipe(jsonstream).pipe(finalstream);

OTHER TIPS

var cntReceived = 0;
proc.stdout.on('data', function (data) {
  var arr = data.toString().split('\n');
  cntReceived += arr.length - 1; // count the newlines in this chunk
  console.log(data.toString());
});
proc.on('close', function (code) {
  console.log(cntReceived);
});

Output: cntReceived = 10000

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow