Should I use the async File IO methods over their synchronous equivalents for local files in node.js?

StackOverflow https://stackoverflow.com/questions/16827373

  •  30-05-2022

Question

I have a very simple utility script that I've written in JavaScript for node.js which reads a file, does a few calculations, then writes an output file. The source in its current form looks something like this:

var fs = require("fs");

fs.readFile(inputPath, function (err, data) {
    if (err) throw err;
    // do something with the data to produce `output`
    fs.writeFile(outputPath, output, function (err) {
        if (err) throw err;
        console.log("File successfully written.");
    });
});

This works fine, but I'm wondering if there is any disadvantage in this case to using the synchronous variety of these functions instead, like this:

var data = fs.readFileSync(inputPath);
// do something with the data
fs.writeFileSync(outputPath, output);
console.log("File successfully written.");

To me, this is much simpler to read and understand than the callback variety. Is there any reason to use the former method in this case?

I realize that speed isn't an issue at all with this simple script I'm running locally, but I'm interested in understanding the theory behind it. When does using the async methods help, and when does it not? Even in a production application, if I'm only reading a file, then waiting to perform the next task, is there any reason to use the asynchronous method?

Solution

What matters is what ELSE your node process needs to do while the synchronous IO happens. For a simple shell script that is run at the command line by a single user, synchronous IO is totally fine: if you were doing asynchronous IO, all you'd be doing is waiting for the IO to come back anyway.

However, in a network service with multiple users you must NEVER make synchronous IO calls while serving requests (avoiding them is kind of the whole point of node, so believe me when I say this). A synchronous call halts processing for ALL connected clients until it returns, and it is complete doom.
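The stall is easy to observe on a small scale. The following is a minimal sketch, not a benchmark: `measureTimerDelay` is a hypothetical helper name, the scratch file in the OS temp directory is an assumption made for illustration, and the 0 ms timer stands in for any other client's pending work.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: keeps the process busy with synchronous reads for
// roughly `blockMs` milliseconds, then reports how late a 0 ms timer fired.
function measureTimerDelay(blockMs, done) {
  // Assumption: a throwaway file in the OS temp directory is fine to use.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "sync-demo-"));
  const file = path.join(dir, "data.txt");
  fs.writeFileSync(file, "some data");

  const start = Date.now();

  // Stands in for other pending work, such as another client's request.
  setTimeout(function () {
    fs.unlinkSync(file);
    fs.rmdirSync(dir);
    done(Date.now() - start);
  }, 0);

  // Repeated synchronous reads: no timer or callback can run until
  // this loop has finished.
  const end = start + blockMs;
  while (Date.now() < end) {
    fs.readFileSync(file);
  }
}

measureTimerDelay(200, function (delayMs) {
  console.log("0 ms timer fired after about " + delayMs + " ms");
});
```

The timer asks for a 0 ms delay, but its callback cannot run until the synchronous loop gives control back, so the reported delay is at least `blockMs`.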

Rule of thumb: shell script: OK, network service: verboten!

For further reading, I made several analogies in this answer.

Basically, when node does asynchronous IO in a network server, it can ask the OS to do many things: read a few files, make some DB queries, send out some network traffic. While waiting for that async IO to be ready, it can do memory/CPU work in the main event thread. Using this architecture, node gets pretty good performance and concurrency.

However, when a synchronous IO operation happens, the entire node process just blocks and does absolutely nothing. It just waits. No new connections can be accepted. No processing happens, no event loop ticks, no callbacks, nothing. Just 1 synchronous operation stalls the entire server for all clients. You must not do it at all. It doesn't matter how fast the operation is, or whether it hits the local filesystem or the network. Even if you spend only 10ms reading a tiny file from disk for each client, if you have 100 clients, client 100 will wait nearly a full second (99 × 10ms = 990ms) while that file is read one at a time, over and over, for clients 1-99.
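The arithmetic in that last example can be written out as a small model. This is a sketch under a simplifying assumption, namely that every request blocks the process for the same fixed time and that requests are served strictly in arrival order; `blockedWaitTimes` is a hypothetical name used only for illustration.

```javascript
// Hypothetical model: each request blocks the process for `blockMs`
// milliseconds, so requests are served strictly one after another.
function blockedWaitTimes(clientCount, blockMs) {
  const waits = [];
  for (let i = 0; i < clientCount; i++) {
    // Client i (0-based) must wait for all i earlier blocking reads.
    waits.push(i * blockMs);
  }
  return waits;
}

const waits = blockedWaitTimes(100, 10);
console.log("client 1 waits " + waits[0] + " ms before its read starts");
console.log("client 100 waits " + waits[99] + " ms before its read starts");
```

With 100 clients and a 10 ms blocking read each, the last client waits for the 99 reads ahead of it, which is 990 ms before its own read even begins.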

OTHER TIPS

Asynchronous code does not block the flow of execution, allowing your program to perform other tasks while waiting for an operation to complete.

In your first example, the code after each call can continue running without waiting for the file to be read or written; the callback fires once the operation completes. In your second example, the code execution is "blocked" until each operation finishes. This is why synchronous code is known as "blocking" while asynchronous is known as "non-blocking."
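The difference shows up in the order in which statements run. Here is a minimal sketch, assuming a scratch file in the OS temp directory; `compareOrdering` and the event labels are hypothetical names chosen for illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: records the order in which each step runs.
function compareOrdering(done) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "order-demo-"));
  const file = path.join(dir, "out.txt");
  const events = [];

  // Synchronous: the next line does not run until the write has finished.
  fs.writeFileSync(file, "sync");
  events.push("sync write finished");

  // Asynchronous: the call returns immediately; the callback runs later.
  fs.writeFile(file, "async", function (err) {
    if (err) return done(err);
    events.push("async write finished");
    fs.unlinkSync(file);
    fs.rmdirSync(dir);
    done(null, events);
  });

  // This line runs before the asynchronous write has completed.
  events.push("code after async call");
}

compareOrdering(function (err, events) {
  if (err) throw err;
  console.log(events.join("\n"));
});
```

The recorded order is "sync write finished", then "code after async call", then "async write finished": the statement following `fs.writeFile` runs before the write's callback, while the statement following `fs.writeFileSync` runs only after the write is done.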

Licensed under: CC-BY-SA with attribution