Assume that a large file is saved on disk and I want to run a computation on every chunk of data contained in the file.

The C/C++ code that I would write to do so would load part of the file, then do the processing, then load the next part, then do the processing of this next part, and so on.

If I am, however, interested to do so in the shortest possible time, I could actually do the following: First, tell DMA-controller to load first part of the file. When this part is loaded tell the DMA-controller to load the second part (in some other part of the memory) and then immediately start processing the first part.

If I get an interrupt from the DMA during processing the first part, I finish the first part and afterwards tell the DMA to overwrite it with the third part of the file; then I process the second part.

If I do not get an interrupt from the DMA during processing the first part, I finish the first part and wait for the interrupt of the DMA.

Depending of how long the processing takes in relation to the disk-read, this should be up to twice as fast. In reality, of course, one would have to measure. But that is not the question I am asking.

The question is: Is it possible to do this a) in C using some non-standard extension or b) in assembly? Or do operating systems not allow such things in general? The question is meant primarily in a single-thread context, although I also would be interested to know how to do it with two threads. Also, I am interested in specific code; this is more of a theoretical question.

有帮助吗?

解决方案

You're right that you will not get the benefit of this by default, because a blocking read stops your thread from doing any processing. Hans is right that modern OSes already take care of all the little details of DMA and interrupt completion routines.

You need to use the architecture you've described, of issuing a request in advance of when you will use the data. Issue asynchronous I/O requests (on Windows these are called OVERLAPPED). Then the flow will go exactly as you envisions, but the DMA and interrupts are handled in the drivers.

On Windows, take a look at FILE_FLAG_OVERLAPPED (to CreateFile) and ReadFile (if you like events) or ReadFileEx (if you like callbacks). If you don't have to process the data in any particular order, then add a completion port to the mix, which queues the completion responses.

On Linux, OSX, and many other Unix-like OSes, look at aio_read. Or fadvise. Or use mmap with madvise.

And you can get these benefits without even writing native code. .NET recently added the ReadAsync method to its FileStream, which can be used with continuation-passing style in the form of Task objects, with async/await syntactic sugar in the C# compiler.

其他提示

Typically, in a multi-mode (user/system) operating system, you do not have access to direct dma or to interrupts. In systems that extend those features from kernel(system) mode down to user mode, the overhead eliminates the benefit of using them.

Ignoring that what you're asking to do requires a very specialized environment to support it, the idea is sound and common: declaring two (or more) buffers to enable DMA to the next while you process the first. When two buffers are used they're sometimes referred to as ping-pong buffers.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top