Managing lots of callback recursion in Nodejs
05-07-2019
Question
In Node.js, virtually no I/O operations block. This means that almost all Node.js I/O code involves many callbacks, whether it is reading from or writing to databases, files, processes, etc. A typical example of this is the following:
var useFile = function(filename,callback){
    posix.stat(filename).addCallback(function (stats) {
        posix.open(filename, process.O_RDONLY, 0666).addCallback(function (fd) {
            posix.read(fd, stats.size, 0).addCallback(function(contents){
                callback(contents);
            });
        });
    });
};
...
useFile("test.data",function(data){
    // use data..
});
I am anticipating writing code that will make many IO operations, so I expect to be writing many callbacks. I'm quite comfortable with using callbacks, but I'm worried about all the recursion. Am I in danger of running into too much recursion and blowing through a stack somewhere? If I make thousands of individual writes to my key-value store with thousands of callbacks, will my program eventually crash?
Am I misunderstanding or underestimating the impact? If not, is there a way to get around this while still using Nodejs' callback coding style?
Solution
None of the code you show is using recursion. When you call useFile, it calls posix.stat(), which returns, and useFile terminates as it has run to completion. At some later time, when the call to posix.stat() has completed within the underlying system and the results are available, the callback function you added for that will be executed. That calls posix.open(), and then terminates as it has run to completion. Once the file has been successfully opened, the callback function for that will execute, calling posix.read(), and will then terminate as it, too, has run to completion. Finally, when the results of the read are available, the innermost function will be executed.
The important point is that each function runs to completion, as the calls to the posix.*() functions are non-blocking: that is, they return immediately, having caused some magic to be started off in the underlying system. So each of your functions terminates, and later an event will cause the next function to execute; but at no point is there any recursion.
The nested structure of the code can give one the impression that the stuff inside will have to finish before the stuff outside can get to its own end point. But in this style of asynchronous event-driven programming it makes more sense to see the nesting in terms of deeper => happens-later-than.
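One way to check this claim is to measure the call stack depth inside each callback. The following is only a sketch: `fakeWrite` is a hypothetical stand-in for a non-blocking write to a key-value store (it uses `setImmediate` to call back later), and `stackDepth` and `writeMany` are names made up for the illustration. If chained callbacks were recursive, the depth would grow with every write; here the smallest and largest depth should come out the same.

```javascript
// Count the frames on the current call stack.
// V8 truncates stack traces by default, so lift the limit first.
Error.stackTraceLimit = Infinity;
function stackDepth() {
  return new Error().stack.split('\n').length - 1;
}

// Hypothetical stand-in for a non-blocking I/O call: it returns immediately
// and invokes the callback later, from the event loop.
function fakeWrite(value, callback) {
  setImmediate(function () { callback(null, value); });
}

// Issue `count` writes one after another, each started from the previous callback.
function writeMany(count, done) {
  var depths = [];
  function next(i) {
    if (i === count) { return done(depths); }
    fakeWrite(i, function () {
      depths.push(stackDepth()); // depth as seen inside the i-th callback
      next(i + 1);
    });
  }
  next(0);
}

writeMany(10000, function (depths) {
  console.log('min depth:', Math.min.apply(null, depths));
  console.log('max depth:', Math.max.apply(null, depths));
});
```

Each callback is entered from the event loop rather than from the previous callback, so ten thousand chained writes use no more stack than one.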
EDIT: Try adding some logging statements immediately before the end of each nested function; this will help to illustrate that the order in which they complete is from the outside inwards.
OTHER TIPS
Same example, with debug output added (see below for output):
usefile.js:
var sys = require("sys"),
    posix = require("posix");
var useFile = function(filename,callback){
    posix.stat(filename).addCallback(function (stats) {
        posix.open(filename, process.O_RDONLY, 0666).addCallback(function (fd) {
            posix.read(fd, stats.size, 0).addCallback(function(contents){
                callback(contents);
                sys.debug("useFile callback returned");
            });
            sys.debug("read returned");
        });
        sys.debug("open returned");
    });
    sys.debug("stat returned");
};
useFile("usefile.js",function(){});
Output:
DEBUG: stat returned
DEBUG: open returned
DEBUG: read returned
DEBUG: useFile callback returned
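The `posix` module and `addCallback` date from very early Node.js releases, so the example above may not run on a current installation. Below is a rough equivalent using the callback API of the `fs` module. It is a sketch rather than a drop-in replacement: the `log` helper, the `order` array and the temporary demo file are assumptions added for illustration, and basic error handling has been added.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var order = []; // records the order in which the functions finish

function log(message) {
  order.push(message);
  console.log('DEBUG: ' + message);
}

function useFile(filename, callback) {
  fs.stat(filename, function (err, stats) {
    if (err) { return callback(err); }
    fs.open(filename, 'r', function (err, fd) {
      if (err) { return callback(err); }
      var buffer = Buffer.alloc(stats.size);
      fs.read(fd, buffer, 0, stats.size, 0, function (err, bytesRead) {
        // Always close the descriptor before reporting the result.
        fs.close(fd, function () {
          if (err) { return callback(err); }
          callback(null, buffer.toString('utf8', 0, bytesRead));
          log('useFile callback returned');
        });
      });
      log('read returned');
    });
    log('open returned');
  });
  log('stat returned');
}

// Demo input: a small temporary file (hypothetical name).
var demoFile = path.join(os.tmpdir(), 'usefile-demo-' + process.pid + '.txt');
fs.writeFileSync(demoFile, 'some data\n');

useFile(demoFile, function (err, contents) {
  fs.unlinkSync(demoFile);
  if (err) { throw err; }
  console.log('read ' + contents.length + ' characters');
});
```

The DEBUG lines should appear in the same outside-inwards order as in the original output: the outer function finishes first and the innermost callback last.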
You can try http://github.com/creationix/do or roll your own like I did. Never mind missing error handling for now (just ignore that) ;)
var Simplifier = exports.Simplifier = function() {};

Simplifier.prototype.execute = function(context, functions, finalFunction) {
    this.functions = functions;
    this.results = [];
    this.finalFunction = finalFunction;
    this.totalNumberOfCallbacks = 0;
    this.context = context;
    var self = this;

    functions.forEach(function(f, index) {
        f(function() {
            self.totalNumberOfCallbacks = self.totalNumberOfCallbacks + 1;
            // Store the result by position. Using the function itself as an object key
            // would stringify its source, so two identical functions would collide.
            self.results[index] = Array.prototype.slice.call(arguments, 0);
            if(self.totalNumberOfCallbacks >= self.functions.length) {
                // Order the results by the calling order of the functions
                var finalResults = self.results.map(function(args) {
                    return args[0];
                });
                // Call the final function passing back all the collected results in the right order
                finalFunction.apply(self.context, finalResults);
            }
        });
    });
};
And a simple example using it
// Execute
new simplifier.Simplifier().execute(
    // Context of execution
    self,
    // Array of processes to execute before doing final handling
    [function(callback) {
        db.collection('githubusers', function(err, collection) {
            collection.find({}, {limit:30}, function(err, cursor) {
                cursor.toArray(function(err, users) { callback(users); })
            });
        });
    },
    function(callback) {
        db.collection('githubprojects', function(err, collection) {
            collection.find({}, {limit:45, sort:[['watchers', -1]]}, function(err, cursor) {
                cursor.toArray(function(err, projects) { callback(projects); })
            });
        });
    }
    ],
    // Handle the final result
    function(users, projects) {
        // Do something when ready
    }
);
Your stuff is fine. I do recursive calls in Express to follow HTTP redirects, but what you're doing is "traversal", not recursion.
Also take a look at 'step' (http://github.com/creationix/step) or 'flow-js' on GitHub. These let you write callback flows in a more natural style, which also makes it clear that there's no recursion going on.
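To give a feel for what such a flow-control helper does, here is a tiny hand-rolled version. It is not the API of 'step' or 'flow-js'; `series` and `addLater` are hypothetical names made up for this sketch. The steps are written as a flat list instead of being nested inside one another, and each one receives the previous result plus a Node-style callback.

```javascript
// Hypothetical helper (not the API of 'step' or 'flow-js'): run the steps in order,
// handing each step the previous result and a Node-style callback.
function series(steps, done) {
  var index = 0;
  function next(err, result) {
    // Stop on the first error, or when every step has run.
    if (err || index === steps.length) { return done(err, result); }
    var step = steps[index++];
    step(result, next);
  }
  next(null, undefined);
}

// Stand-in for an asynchronous operation: adds `amount` to the incoming value later.
function addLater(amount) {
  return function (value, callback) {
    setImmediate(function () { callback(null, (value || 0) + amount); });
  };
}

series([addLater(1), addLater(2), addLater(3)], function (err, total) {
  if (err) { throw err; }
  console.log('total:', total); // prints "total: 6"
});
```

The flat list reads top to bottom in the order the steps happen, which is the same "deeper means later" ordering that the nested version expresses through indentation.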
As with any JavaScript, it's possible to make recursive calls with Node.js. If you do run into recursion depth problems (as NickFitz points out, you don't seem to be in danger of that), you can often rewrite your code to schedule each step with a timer (for example setTimeout) instead of calling it directly, so that every step starts on a fresh stack.
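As a rough illustration of that rewrite, the sketch below contrasts a directly recursive function with one that defers each step to the event loop. The function names are made up for the example, and it uses `setImmediate` rather than an interval timer; `setTimeout(fn, 0)` would work the same way, only more slowly because of the minimum timer delay.

```javascript
// Plain recursion: every step adds a stack frame, so a large n overflows the stack.
function countDownRecursive(n) {
  if (n === 0) { return 0; }
  return countDownRecursive(n - 1);
}

// Deferred version: each step is scheduled on the event loop,
// so it starts on a fresh stack no matter how large n is.
function countDownDeferred(n, done) {
  if (n === 0) { return done(); }
  setImmediate(function () { countDownDeferred(n - 1, done); });
}

try {
  countDownRecursive(1000000);
  console.log('recursive version finished');
} catch (err) {
  // With the default stack size this is expected to be a RangeError.
  console.log('recursive version failed:', err.name);
}

countDownDeferred(200000, function () {
  console.log('deferred version finished');
});
```

The trade-off is that the deferred version cannot return a value directly; the result has to be passed to a callback, as in the rest of the code in this thread.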