Question

I'm looking into making some software that makes the keyboard function like a piano (e.g., the user presses the 'W' key and the speakers play a D note). I'll probably be using OpenAL. I understand the basics of digital audio, but playing real-time audio in response to key presses poses some problems I'm having trouble solving.

Here is the problem: let's say I have 10 audio buffers, and each buffer holds one second of audio data. If I have to fill buffers before they are played through the speakers, then I would be filling buffers one or two seconds before they are played. That means that whenever the user tries to play a note, there will be a one- or two-second delay between pressing the key and the note being played.

How do you get around this problem? Do you just make the buffers as small as possible, and fill them as late as possible? Is there some trick that I am missing?


Solution

Most software synthesizers don't use multiple buffers at all.

They just use a single, small ring buffer that is played continuously.

A high-priority thread checks the current play position as often as possible and fills the free part of the ring buffer (i.e., the part that has been played since the last time the thread ran) with fresh sound data.
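
For illustration, here is a minimal C++ sketch of such a fill loop. `currentPlayPosition()` and `renderNextSample()` are hypothetical stand-ins for your audio driver and synth engine, not part of any particular API.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins: in a real program the play position comes from
// your audio driver and the samples come from your synth engine.
static std::atomic<std::size_t> g_playPos{0};
std::size_t currentPlayPosition() { return g_playPos.load(); }
float renderNextSample() { return 0.0f; }   // silence, for the sketch

struct RingBuffer {
    std::vector<float> data;
    std::size_t writePos = 0;               // next sample we will render into
    explicit RingBuffer(std::size_t frames) : data(frames, 0.0f) {}
};

void fillThread(RingBuffer& rb, std::atomic<bool>& running) {
    while (running.load()) {
        // Everything between writePos and the current play position has
        // already been sent to the speakers, so it is free to overwrite.
        const std::size_t playPos = currentPlayPosition();
        while (rb.writePos != playPos) {
            rb.data[rb.writePos] = renderNextSample();
            rb.writePos = (rb.writePos + 1) % rb.data.size();
        }
        // A real implementation would wake on the driver's callback instead.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}
```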

This gives you a constant latency bounded only by the size of your ring buffer and the output latency of your sound card (usually not that much).
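
For a rough sense of scale: assuming a 44.1 kHz sample rate, a 256-frame ring buffer corresponds to about 256 / 44100 ≈ 5.8 ms of buffering latency, on top of whatever the sound card adds.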

You can lower your latency even further:

When a new note is to be played (e.g., the user has just pressed a key), check the current play position within the ring buffer, add a few samples of safety margin, and then re-render the sound data from that point with the new sound settings applied.

This becomes tricky if you have time-based effects running (delay lines, reverb and so on), but it's doable. Just keep track of the last 10 states of your time-based effects every millisecond or so. That makes it possible to step back 10 milliseconds in time.
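
Here is a rough C++ sketch of that snapshot bookkeeping. The `EffectState` type and the once-per-millisecond call are assumptions made for illustration, not a prescribed design.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical container for whatever your time-based effects need to
// resume from: delay-line contents, reverb tail, filter memories, ...
struct EffectState {
    // ...
};

class StateHistory {
public:
    // Call roughly once per millisecond from the render thread.
    void push(std::size_t samplePos, const EffectState& s) {
        history_.push_back({samplePos, s});
        if (history_.size() > kMaxSnapshots)
            history_.pop_front();            // keep only ~10 ms of history
    }

    // On a new note: find the newest snapshot at or before the current
    // play position (plus a small safety margin), restore it, and
    // re-render from that point with the new note applied.
    const EffectState* rollbackTo(std::size_t samplePos) const {
        const EffectState* best = nullptr;
        for (const auto& snap : history_)
            if (snap.pos <= samplePos)
                best = &snap.state;
        return best;                         // nullptr if nothing is old enough
    }

private:
    struct Snapshot { std::size_t pos; EffectState state; };
    static constexpr std::size_t kMaxSnapshots = 10;
    std::deque<Snapshot> history_;
};
```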

OTHER TIPS

With the WinAPI, you can only get so far in terms of latency. Usually you can't get below 40-50 ms, which is quite nasty. The solution is to implement ASIO support in your app and have the user run something like Asio4All in the background. This brings the latency down to around 5 ms, but at a cost: other apps can't play sound at the same time.

I know this because I'm a FL Studio user.

The solution is small buffers, filled frequently by a real-time thread. How small you make the buffers (or how full you let the buffer become with a ring buffer) is constrained by the scheduling latency of your operating system. You'll probably find 10 ms to be acceptable.
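
Since the question mentions OpenAL, here is a minimal sketch of that approach using OpenAL's buffer-queue (streaming) API: a few ~10 ms buffers are queued on one source and refilled as soon as they are processed. Device/context setup and error handling are omitted, and `renderAudio()` is a stand-in for your synth.

```cpp
#include <AL/al.h>
#include <algorithm>
#include <cstdint>
#include <vector>

constexpr int kSampleRate   = 44100;
constexpr int kBufferFrames = 441;          // ~10 ms per buffer at 44.1 kHz
constexpr int kNumBuffers   = 3;

// Stand-in for your synth: fills a buffer with output (silence here).
void renderAudio(std::vector<int16_t>& frames) {
    std::fill(frames.begin(), frames.end(), int16_t{0});
}

void streamLoop(ALuint source, ALuint* buffers) {
    std::vector<int16_t> scratch(kBufferFrames);

    // Prime the queue so the source has something to play from the start.
    for (int i = 0; i < kNumBuffers; ++i) {
        renderAudio(scratch);
        alBufferData(buffers[i], AL_FORMAT_MONO16, scratch.data(),
                     static_cast<ALsizei>(scratch.size() * sizeof(int16_t)),
                     kSampleRate);
    }
    alSourceQueueBuffers(source, kNumBuffers, buffers);
    alSourcePlay(source);

    for (;;) {
        ALint processed = 0;
        alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);
        while (processed-- > 0) {
            // Reclaim a played buffer, refill it with fresh audio, requeue it.
            ALuint buf = 0;
            alSourceUnqueueBuffers(source, 1, &buf);
            renderAudio(scratch);
            alBufferData(buf, AL_FORMAT_MONO16, scratch.data(),
                         static_cast<ALsizei>(scratch.size() * sizeof(int16_t)),
                         kSampleRate);
            alSourceQueueBuffers(source, 1, &buf);
        }
        // If the queue ever ran dry the source stops; restart it.
        ALint state = 0;
        alGetSourcei(source, AL_SOURCE_STATE, &state);
        if (state != AL_PLAYING)
            alSourcePlay(source);
    }
}
```

With three 10 ms buffers, the worst-case latency from this queue alone is around 30 ms, so in practice you would shrink the buffer size or count as far as your OS scheduling allows.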

There are some nasty gotchas in here for the uninitiated - particularly with regard to software architecture and thread-safety.

You could try having a look at Juce - a cross-platform framework for writing audio software, and in particular audio plug-ins such as soft synths and effects. It includes sample code for both plug-ins and hosts, and it is in the host that the threading issues are mostly dealt with.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow