Question

I'm starting to write a soft synthesizer with a peculiar characteristic: the oscillators will have a "continuous waveform" knob that lets users sweep between sine, square and saw tooth waves in a continuous fashion. That is, if the knob is all the way to the left, the output will be a sine wave; in the middle, a saw tooth wave; all the way to the right, a square wave; the intermediate positions will output waves that are "interpolated" versions of the classic waves. (The knob positions and wave types could change, but having a continuous way to morph between wave forms is what I'm after.)

I've thought of a couple of ways to implement the oscillator:

  1. Come up with a function that takes the knob position and calculates the spectrum of the actual signal (an array of amplitudes and frequencies) and then use a bunch of sine functions and a sum block to implement the output signal.

  2. Similar to 1. but apply a reverse Fourier transform instead of the sines and sum (OK, at this point I'm not sure if they are actually the same thing.)

  3. Generate a waveform table for each possible knob position and use a wave table synthesis technique to generate the output signal.

  4. Start with 2 saw tooth waves (they contain both even and odd harmonics), invert one and sum them, and control the amplitude of each one of them with the knob. The wave forms would not be

I have a few questions:

A. I've read that technique number 1 is very processor intensive and not really feasible. Does this hold true for ARM processors such as the one on the iPad?

B. Whatever technique I end up choosing, can the problem of aliasing be resolved simply by connecting a low-pass filter to the output of the oscillator?

C. Any other suggestion on how to implement such an oscillator?

D. Any suggestions on which C++ toolkit to use? I've been looking at the STK from CCRMA but I don't know if there are other more suitable libraries.

Wish me luck! ;)

Edit: Someone pointed me to din last night. Bezier curves are another option to consider.


Solution

I think you may be over-complicating this. If I understand correctly, all you're doing with your continuous waveform knob is mixing together different amounts of the 3 waveforms. So just generate all 3 waveforms all the time, then sum them together with gains set by the knob position, according to the mix of waveforms you've described.
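A minimal sketch of that idea (the struct and its names are mine, not from any library; the naive saw and square here will alias, as discussed below):

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Hypothetical morphing oscillator. knob in [0,1]:
// 0 -> pure sine, 0.5 -> pure saw, 1 -> pure square.
struct MorphOsc {
    double phase = 0.0;        // normalized phase in [0,1)
    double freq = 440.0;
    double sampleRate = 44100.0;

    double next(double knob) {
        double sine   = std::sin(2.0 * kPi * phase);
        double saw    = 2.0 * phase - 1.0;          // naive: aliases
        double square = phase < 0.5 ? 1.0 : -1.0;   // naive: aliases

        // Piecewise-linear gains: sine->saw over [0,0.5], saw->square over [0.5,1].
        double out;
        if (knob < 0.5) {
            double t = knob * 2.0;
            out = (1.0 - t) * sine + t * saw;
        } else {
            double t = (knob - 0.5) * 2.0;
            out = (1.0 - t) * saw + t * square;
        }
        phase += freq / sampleRate;
        if (phase >= 1.0) phase -= 1.0;
        return out;
    }
};
```

The band limiting still has to happen inside each of the three generators; the crossfade itself adds no new harmonics.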

For band limited waveform synthesis to avoid aliasing, you'll probably find most of what you need here.

Hope that helps.

OTHER TIPS

Here is an answer for B (Can the problem of aliasing be resolved simply by connecting a low-pass filter to the output?) which touches on some of the other points.

Unfortunately, the answer is 'no'. Aliasing is caused by the presence of harmonic frequencies above the Nyquist frequency (i.e. half the sample rate.) If those harmonics are present in your oscillator's waveform, filtering cannot help. (Suitably aggressive filtering will destroy the character of the waves you've generated.) Oversampling (another answer mentions this) can, but it's expensive.

To avoid this, you have to generate 'band limited' waveforms. That is, waveforms which have no harmonics above some chosen value < Nyquist. Doing this is non-trivial. This paper here is worth a read. There are two established, practical approaches to solving this problem: BLITs (Band Limited Impulse Trains) and MinBLEPs. Both approaches try to smooth out the harmonic-generating discontinuities by inserting 'stuff' at appropriate points in the waveform.
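The most direct band-limited construction (not a BLIT or MinBLEP, just a truncated Fourier series) sums sine harmonics only up to Nyquist. A sketch, with names of my own choosing:

```cpp
#include <cmath>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Band-limited sawtooth: sum sin(2*pi*k*f*t)/k for all harmonics k
// whose frequency k*f stays below Nyquist (sampleRate/2).
// This is expensive per sample, which is why wavetables and BLITs exist.
std::vector<double> bandlimitedSaw(double freq, double sampleRate, int numSamples) {
    int maxHarmonic = static_cast<int>(sampleRate / (2.0 * freq));
    std::vector<double> out(numSamples, 0.0);
    for (int n = 0; n < numSamples; ++n) {
        double t = n / sampleRate;
        double s = 0.0;
        for (int k = 1; k <= maxHarmonic; ++k)
            s += std::sin(2.0 * kPi * k * freq * t) / k;
        out[n] = (2.0 / kPi) * s;   // scale roughly into [-1, 1]
    }
    return out;
}
```

Note the harmonic count depends on the fundamental, so a single precomputed table only stays alias-free over a limited pitch range.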

With that in mind your options start to shrink. Probably the best compromise between ease and sound would be generating a series of band limited wavetables. You'd still need to investigate some form of anti-aliasing to handle the interpolated waves, though.

The iDevice ARM is quite capable of doing DSP in realtime. General advice: write tight code, use inline functions and avoid division. Your render loop is going to be called 44,100 times a second, so, as long as your code completes within 1/44100 sec (0.023ms) you'll have no problems. In practice, you should be able to run several oscillators simultaneously with no issues at all. The presence of all those music apps on the app store is testament to that.

STK is a great intro library. (Perry Cook's book "Real-time Audio Synthesis for Interactive Applications" is also a good grounding and worth a read.) STK is purposely not optimised, though, and I'm not sure how well it would lend itself to generating your 'continuous' waveforms. kvraudio.com and musicdsp.org should be on your reading list.

The Fourier transform is linear, so taking the FFT of e.g. square and saw waves, crossfading every harmonic linearly, and then taking it back to the time domain (either by iFFT or by summing sines) should give exactly the same output as just crossfading the saw and square signals directly. I'm not sure if that is what you wanted to do, but if it is, there's no need to do FFTs or compute intermediate tables.
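That linearity claim is easy to check numerically. A naive O(N^2) DFT pair, hand-rolled here so no FFT library is assumed:

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Naive DFT and inverse, only to demonstrate linearity:
// idft(a*X + b*Y) equals a*idft(X) + b*idft(Y).
using cvec = std::vector<std::complex<double>>;

cvec dft(const std::vector<double>& x) {
    const double pi = 3.14159265358979323846;
    std::size_t n = x.size();
    cvec X(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j)
            X[k] += x[j] * std::polar(1.0, -2.0 * pi * k * j / n);
    return X;
}

std::vector<double> idft(const cvec& X) {
    const double pi = 3.14159265358979323846;
    std::size_t n = X.size();
    std::vector<double> x(n);
    for (std::size_t j = 0; j < n; ++j) {
        std::complex<double> s(0.0, 0.0);
        for (std::size_t k = 0; k < n; ++k)
            s += X[k] * std::polar(1.0, 2.0 * pi * k * j / n);
        x[j] = s.real() / n;
    }
    return x;
}
```

Crossfading the spectra of a saw and a square and inverse-transforming yields, sample for sample, the same signal as crossfading the two waveforms directly.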

There are many other ways of smoothly "fading" between waveforms of course - you could use phase distortion, for example, with a distortion curve consisting of linear segments that you move from positions that generate a square to positions that generate a saw. This is probably very tricky to implement in a way that is inherently band-limited.
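One way to sketch that phase distortion idea, in the spirit of the Casio CZ technique (the breakpoint parameterization here is my own illustration, not any synth's exact curve):

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Bend a linear phase ramp through a movable breakpoint (d, 0.5),
// then read a cosine through the bent phase. d must lie in (0, 1):
// d = 0.5 leaves the ramp undistorted (pure cosine); moving d toward 0
// sharpens the first half-cycle, approaching a saw-like shape.
double phaseDistort(double phase, double d) {
    if (phase < d)
        return 0.5 * phase / d;                   // reach 0.5 at the breakpoint
    return 0.5 + 0.5 * (phase - d) / (1.0 - d);   // then continue to 1.0
}

double pdOsc(double phase, double d) {
    return std::cos(2.0 * kPi * phaseDistort(phase, d));
}
```

Sweeping d with the knob gives a continuous morph, but as noted above, nothing in this construction is inherently band-limited.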

Aliasing can, in practice, often be resolved using oversampling and filtering, or just filtering. Using band-limited techniques is better since aliasing will always cause some noise, but you can often filter it low enough to be inaudible which is what matters for audio synthesis.

A. I've read that technique number 1 is very processor intensive and not really feasible. Does this hold true for ARM processors such as the one on the iPad?

This makes a simple problem complex (pun intended). Accelerate.framework provides optimized implementations of these functions (fwiw), but it's still complicating a simple problem. As a general note: floating point calculations on these devices are slow. A floating point implementation can compromise your program's performance considerably, and would likely force you to cut features, polyphony, or quality. Without knowing the requirements, it's hard to say whether you could get by with floating point calcs.

B. Whatever technique I end up choosing, can the problem of aliasing be resolved simply by connecting a low-pass filter to the output of the oscillator?

That won't work for signals generated in the time domain, unless you oversample.

C. Any other suggestion on how to implement such an oscillator?

see below

D. Any suggestions on which C++ toolkit to use? I've been looking at the STK from CCRMA but I don't know if there are other more suitable libraries.

STK is more like an instructional tool than a toolkit designed for embedded synthesisers. More suitable implementations exist.

Option 1. Come up with a function that takes the knob position and calculates the spectrum of the actual signal (an array of amplitudes and frequencies) and then use a bunch of sine functions and a sum block to implement the output signal.

Option 2. Similar to 1. but apply a reverse Fourier transform instead of the sines and sum (OK, at this point I'm not sure if they are actually the same thing.)

That's relatively slow on desktops.

Option 4. Start with 2 saw tooth waves (they contain both even and odd harmonics), invert one and sum them, and control the amplitude of each one of them with the knob. The wave forms would not be

You could do this quite efficiently (e.g. with a BLIT) for alias-free generation. However, a BLIT is restricted to a handful of waveforms (you can use it for the saw and square). You can look back at history and ask "How did they solve this problem in hardware and software synths ca. 2000?". This was one solution. Another was:

Option 3. Generate a waveform table for each possible knob position and use a wave table synthesis technique to generate the output signal.

Considering the device's capabilities, I'd recommend an int implementation of this or the BLIT.

The table's easy to grok and implement, and provides good sound and CPU results. It's also highly configurable for CPU/Memory/Quality tradeoffs.

If you want alias-free output (or close to it), go for a BLIT (or a relative of it). The reason is that you would need a good chunk of memory and a good amount of oversampling to get minimal to no audible aliasing with wavetables.

Implementation:

There are numerous BLIT (and family) implementations online.
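For a flavour of what those look like, here's a sketch of the closed-form band-limited impulse train (the class and parameter names are mine; integrating its output minus the DC offset through a leaky integrator yields a band-limited saw):

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Band-limited impulse train: y = (M/P) * sin(pi*M*phase) / (M * sin(pi*phase)),
// where P is the period in samples and M is the (odd) number of harmonics kept.
// One unit-area impulse is produced per period, with no content above Nyquist.
struct Blit {
    double phase = 0.0;   // normalized phase in [0,1)
    double P;             // period in samples
    int M;                // odd harmonic count, M <= P

    Blit(double freq, double sampleRate) {
        P = sampleRate / freq;
        M = static_cast<int>(P);
        if (M % 2 == 0) --M;          // keep M odd
    }

    double next() {
        double denom = std::sin(kPi * phase);
        double sincM;
        if (std::fabs(denom) < 1e-9)
            sincM = 1.0;              // limit of the ratio at the impulse peak
        else
            sincM = std::sin(kPi * M * phase) / (M * denom);
        phase += 1.0 / P;
        if (phase >= 1.0) phase -= 1.0;
        return (M / P) * sincM;
    }
};
```

To get a saw from this, subtract the DC level 1/P from each sample and run the result through a leaky integrator (e.g. state = state * 0.999 + input).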

Here's a napkin-scribbling for tables:

#include <cstdint>

enum { WF_Sine, WF_Saw, WF_Square, WF_COUNT };
enum { TableSize = SomePowerOfTwo };   // power of two keeps phase wrapping a cheap mask

struct sc_waveform {
    uint32_t at[TableSize];            // fixed-point samples: int math suits this hardware
};

enum { NPitches = Something };         // one table per pitch range, to stay band limited

sc_waveform oscs[WF_COUNT][NPitches];

Upon initialization, use additive synthesis to populate oscs.
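That population step can be sketched like so (using double tables rather than the fixed-point uint32_t of the snippet above, for clarity; names are mine):

```cpp
#include <cmath>
#include <cstddef>

constexpr double kPi = 3.14159265358979323846;

// Fill one wavetable by summing sine harmonics up to maxHarmonic, which
// keeps the table band limited for the pitch range it is built for.
// harmonicAmp(k) returns the k-th harmonic's amplitude: 1/k for a saw,
// 1/k for odd k only (else 0) for a square, nonzero only at k == 1 for a sine.
template <typename AmpFn>
void fillTable(double* table, std::size_t size, int maxHarmonic, AmpFn harmonicAmp) {
    for (std::size_t i = 0; i < size; ++i) {
        double phase = static_cast<double>(i) / size;
        double s = 0.0;
        for (int k = 1; k <= maxHarmonic; ++k)
            s += harmonicAmp(k) * std::sin(2.0 * kPi * k * phase);
        table[i] = s;
    }
}
```

At init time you'd call this once per waveform per pitch range, choosing maxHarmonic so the highest partial stays under Nyquist for that range.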

During playback, use either:

  • interpolation and oversampling to read from the tables
  • or a good amount of oversampling of the signal and then downsampling (which is CPU efficient).

For reference: I'd estimate that linear interpolation of a table consuming an irresponsible amount of memory (considering what's available), without oversampling, should keep your alias frequencies at or below -40 dB, assuming you don't audibly damp the highest partials and you render at 44.1 kHz. That is a naive brute-force approach! You can do better with a little extra work.

Finally, you should also find relevant info if you google "Vector Synthesis" -- what you describe is a primitive form of it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow