Question

According to this tutorial, asynchronous disk file I/O can easily be achieved using AIO on Linux, at least from a programming/API point of view. But before and after reading it, I had read a lot of posts and articles saying that this either cannot be done, or that you should use libevent with a patch, along with many other issues. Another concern was the loop in which I would have to wait for a signal, but based on this tutorial I can use a callback mechanism, which obviously makes AIO much easier to use.

Now, I am not a Linux programmer by a long shot. I just wanted to find a straightforward way to support asynchronous disk file I/O on Linux, learn it, and add it to an async disk I/O library that I need for a personal project. Currently I'm using overlapped I/O on Windows and I/O worker threads on non-Windows platforms. Since the multithreaded solution can be tricky, I wanted to replace it on Linux with AIO.

So, what is wrong with AIO as described in this tutorial? Is it performance? Is there a restriction on the operations that can be done using AIO?

P.S. I don't care if the code is not portable to other POSIX-compliant platforms, as long as it works on the major Linux distributions. All I care about is regular disk file I/O.

Thanks.


Solution

The tutorial gives an overview of asynchronous I/O in general and talks about how there is kernel support for it. It then goes on to discuss POSIX AIO (the standardized API for accessing asynchronous I/O), implying that using the POSIX AIO API on Linux will give you access to the kernel support for AIO. This is not the case.

On Linux, there are really two separate AIO implementations:

  1. Kernel AIO (io_submit() et al.), which is only supported in kernel 2.6 (or really 2.5; there may be back-ported versions of it for 2.4).
  2. POSIX AIO, which is a glibc feature, essentially unrelated to the kernel. It implements the POSIX API in terms of user-level threads making blocking disk I/O calls.

So, in short, if you already have a generic implementation of multiple threads for disk I/O, you might be better off using that than to use glibc's implementation (because you might have slightly more control over it).

If you're committed to actually use the io_submit() family of functions, you may have to do quite a lot of work to circumvent the restrictions on those functions.

Kernel AIO effectively requires your files to be opened with O_DIRECT (without it, io_submit() tends to block while the I/O is performed), which in turn requires all your file offsets and read and write sizes to be aligned to blocks on the disk. This is typically fine if you're just using one large file, where you can make it work much like the page cache in the OS. For reading and writing arbitrary files at arbitrary offsets and lengths, it gets messy.
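
As a rough illustration of that messiness, the sketch below serves an unaligned read by widening it to aligned boundaries and copying through an aligned bounce buffer. DIO_ALIGN, align_down, align_up and direct_pread are hypothetical names, and the 4096-byte alignment is an assumption: the real requirement depends on the device and filesystem. It uses a plain pread() to keep the example short; the same arithmetic applies to the offsets and lengths you would put in an iocb.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment; query the real value for your device in production. */
#define DIO_ALIGN 4096u

static uint64_t align_down(uint64_t x)
{
    return x & ~(uint64_t)(DIO_ALIGN - 1);
}

static uint64_t align_up(uint64_t x)
{
    return align_down(x + DIO_ALIGN - 1);
}

/*
 * Reads len bytes at an arbitrary offset from fd (intended to be opened
 * with O_DIRECT). Returns the number of bytes copied into out, or -1.
 */
ssize_t direct_pread(int fd, void *out, size_t len, uint64_t off)
{
    assert((DIO_ALIGN & (DIO_ALIGN - 1)) == 0);   /* must be a power of two */
    if (len == 0)
        return 0;

    uint64_t start = align_down(off);             /* widen to aligned range */
    uint64_t end = align_up(off + len);
    size_t span = (size_t)(end - start);

    void *bounce = NULL;                          /* aligned bounce buffer */
    if (posix_memalign(&bounce, DIO_ALIGN, span) != 0)
        return -1;

    ssize_t copied = -1;
    ssize_t n = pread(fd, bounce, span, (off_t)start);
    if (n >= 0) {
        size_t skip = (size_t)(off - start);      /* bytes before the request */
        size_t avail = (size_t)n > skip ? (size_t)n - skip : 0;
        copied = (ssize_t)(avail < len ? avail : len);
        memcpy(out, (char *)bounce + skip, (size_t)copied);
    }
    free(bounce);
    return copied;
}
```

The intended use is with a descriptor from open(path, O_RDONLY | O_DIRECT), which on glibc needs _GNU_SOURCE defined before the includes. Writes are worse: an unaligned write needs a read-modify-write of the partial blocks at each end.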

If you end up giving kernel AIO a shot, I would highly recommend looking into tying one or more eventfds to your iocbs so that you can wait on completions using epoll/select rather than having to block in io_getevents().
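
A minimal sketch of that eventfd arrangement follows. It calls the kernel interface through syscall() because glibc does not wrap these calls (libaio offers equivalent wrappers). The sys_* helpers and kaio_read_with_eventfd are hypothetical names, and error handling is reduced to returning -1.

```c
#include <fcntl.h>               /* open() flags for callers */
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Thin wrappers over the raw system calls. */
static long sys_io_setup(unsigned nr, aio_context_t *ctx)
{
    return syscall(SYS_io_setup, nr, ctx);
}
static long sys_io_destroy(aio_context_t ctx)
{
    return syscall(SYS_io_destroy, ctx);
}
static long sys_io_submit(aio_context_t ctx, long n, struct iocb **list)
{
    return syscall(SYS_io_submit, ctx, n, list);
}
static long sys_io_getevents(aio_context_t ctx, long min_nr, long nr,
                             struct io_event *ev, struct timespec *ts)
{
    return syscall(SYS_io_getevents, ctx, min_nr, nr, ev, ts);
}

/* Reads len bytes at off from fd; waits in epoll, not in io_getevents(). */
long long kaio_read_with_eventfd(int fd, void *buf, size_t len, long long off)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct epoll_event want, got;
    struct io_event ev;
    struct timespec zero = { 0, 0 };
    uint64_t completions = 0;
    long long result = -1;
    int efd = -1, epfd = -1;

    if (sys_io_setup(8, &ctx) < 0)
        return -1;
    efd = eventfd(0, EFD_NONBLOCK);
    epfd = epoll_create1(0);
    if (efd < 0 || epfd < 0)
        goto out;

    memset(&want, 0, sizeof want);
    want.events = EPOLLIN;
    want.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &want) < 0)
        goto out;

    memset(&cb, 0, sizeof cb);
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_fildes = (uint32_t)fd;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;
    cb.aio_flags = IOCB_FLAG_RESFD;   /* signal efd on completion */
    cb.aio_resfd = (uint32_t)efd;

    if (sys_io_submit(ctx, 1, list) != 1)
        goto out;

    /* The same epoll set could also watch sockets, timers, etc. */
    if (epoll_wait(epfd, &got, 1, -1) != 1)
        goto out;
    if (read(efd, &completions, sizeof completions) != (ssize_t)sizeof completions)
        goto out;

    /* The eventfd fired, so reaping with a zero timeout does not block. */
    if (sys_io_getevents(ctx, 1, 1, &ev, &zero) == 1)
        result = ev.res;              /* bytes read, or negative errno */
out:
    if (epfd >= 0)
        close(epfd);
    if (efd >= 0)
        close(efd);
    sys_io_destroy(ctx);
    return result;
}
```

In a real event loop you would create the AIO context, the eventfd and the epoll set once, keep many iocbs in flight, and use the counter read from the eventfd as the number of completions to reap. For the submission itself not to block, fd should be opened with O_DIRECT and the buffer, length and offset aligned accordingly.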

OTHER TIPS

The Linux (glibc) implementation of POSIX AIO spawns a thread for every write you do. This is usually not good, and you're better off using your own worker threads to do the writes, so that you can control how many threads are in play. In other words, stick with what you have; AIO isn't going to buy you anything.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow