What is the correct formula for amplifying waveform audio?

https://stackoverflow.com/questions/6037947

14-11-2019

Question

I am wondering what the correct formula is for amplifying waveform audio in C++.

Let's say there is 16-bit waveform data like the following: 0x0000, 0x2000, 0x3000, 0x2000, 0x0000, (negative part), ...

For acoustic reasons, simply doubling the numbers won't make the audio twice as loud, like this: 0x0000, 0x4000, 0x6000, 0x4000, 0x0000, (doubled negative part), ...

If anyone knows audio processing well, please let me know.

Solution

If you double all the sample values, it will indeed be "twice as loud" in terms of amplitude, that is, about 6 dB louder. Of course, you need to be careful to avoid distortion due to clipping. That is the main reason why professional audio processing software today generally uses float samples internally.

You may need to convert back to integers when you finally output the sound data. If you're just writing a plugin for some DAW (as I would recommend if you want to program simple yet effective sound FX), the host does all of this for you: you get a float, do something with it, and output a float again. But if you want to, for instance, write a .wav file directly, you first need to limit the output so that everything above 0 dB (which is ±1 in a usual float stream) is clipped to ±1. Then you can multiply by the largest value your desired integer type can hold (for example 32767 for 16-bit samples) and cast the result to that type. Done.
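The round trip described above (integer to float, multiply, clip, back to integer) can be sketched roughly as follows. This is only a minimal illustration: the function name `amplify` is made up for this example, and it assumes 16-bit signed PCM with 0 dB mapped to ±1.0.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original answer): amplify 16-bit PCM
// samples by a linear gain factor, working in float internally.
std::vector<std::int16_t> amplify(const std::vector<std::int16_t>& in, float gain) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::int16_t s : in) {
        // Convert to float so that full scale is roughly [-1, 1).
        float x = static_cast<float>(s) / 32768.0f;
        // The amplification itself is a plain multiplication.
        x *= gain;
        // Hard-clip everything beyond 0 dB (+-1.0).
        x = std::max(-1.0f, std::min(1.0f, x));
        // Scale to the 16-bit range, round, and cast.
        out.push_back(static_cast<std::int16_t>(std::lround(x * 32767.0f)));
    }
    return out;
}
```

With a gain of 2.0, the sample 0x2000 from the question comes out as 0x4000, and anything that would exceed full scale is clipped to ±32767 instead of wrapping around.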

Anyway, you're certainly right that it's important to scale your volume knob logarithmically rather than linearly (many consumer-grade programs don't, which is just stupid because you end up using values very close to the left end of the knob's range most of the time). But that has nothing to do with the amplification calculation itself; it's just because we perceive the loudness of signals on a logarithmic scale. The level itself is still determined by simply multiplying the sound pressure by a constant factor, and the sound pressure is in turn proportional to the voltage in the analog circuitry and to the values of the digital samples in any DSP.
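One way to picture the split between a logarithmic knob and a linear multiplication is to let the knob select a level in dB and convert that to a linear factor with the standard amplitude relation gain = 10^(dB/20). The sketch below is an assumption for illustration only: the names `dbToGain` and `knobToGain` and the -60 dB bottom of the range are invented here.

```cpp
#include <cmath>

// Convert a level change in decibels to a linear amplitude factor.
float dbToGain(float db) {
    return std::pow(10.0f, db / 20.0f);
}

// Hypothetical knob mapping: position in [0, 1] is spread evenly over
// [minDb, 0] dB, so equal knob movements give equal dB steps.
float knobToGain(float position, float minDb = -60.0f) {
    if (position <= 0.0f) return 0.0f;      // fully left: mute
    if (position > 1.0f) position = 1.0f;   // fully right: 0 dB, gain 1
    return dbToGain(minDb * (1.0f - position));
}
```

The resulting factor is then applied to each sample by plain multiplication; for example, `dbToGain(6.0f)` is close to 2, which matches the "double the samples for 6 dB" rule.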

Another thing: I don't know how far you intend to go, but if you want to do this really properly you should not just clip away peaks that are over 0 dB (clipping sounds very harsh), but implement a proper compressor/limiter. This automatically prevents clipping by reducing the level in the loudest parts. You don't want to overdo this either (popular music is usually over-compressed anyway, and as a result a lot of the dynamic musical expression is lost), but it is still a "less dangerous" way of increasing the audio level.
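To give a rough idea of what a limiter does, here is a deliberately simplified sketch: it tracks the peak level with an instant attack and a slow exponential release, and scales samples down whenever that tracked level exceeds the threshold. The function name `limit` and the default parameters are assumptions made for this example; real limiters typically add look-ahead and gain smoothing, which are left out here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical minimal peak limiter on float samples (0 dB == +-1.0).
// threshold: maximum allowed absolute output value.
// release:   per-sample decay of the tracked peak (closer to 1 = slower).
void limit(std::vector<float>& samples, float threshold = 1.0f, float release = 0.999f) {
    float envelope = 0.0f;  // running estimate of the recent peak level
    for (float& x : samples) {
        const float level = std::fabs(x);
        // Instant attack, exponential release; envelope never drops below level.
        envelope = std::max(level, envelope * release);
        if (envelope > threshold) {
            // Reduce the gain so the tracked peak lands on the threshold.
            x *= threshold / envelope;
        }
    }
}
```

Because the tracked peak is never smaller than the current sample, the output never exceeds the threshold, and quieter samples right after a peak are turned down along with it rather than being chopped off individually.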

Other answers

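I have always used linear multiplication for this and it has never failed. It even works for fade-outs, for example:

float amp = 1.2f;
short sample = 0x2000;  // example input sample
// Cast the product, not amp alone: (short)amp * sample would truncate amp to 1.
short newSample = (short)(amp * sample);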

If you want your fade-out to be linear, do this in the sample processing loop:

amp -= 0.03f;

and if you want it to be logarithmic (a constant ratio per step, so the level falls linearly in dB), do this in the sample processing loop:

amp *= 0.97f;

until amp reaches some small value (amp < 0.1).
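Put into complete loops, the two fade-outs might look like the sketch below. The function names are made up for this example, and instead of the fixed constants 0.03 and 0.97 the step and decay are derived from the buffer length, an assumption made here because a per-sample factor of 0.97 would drop below 0.1 after only about 76 samples.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: linear fade, gain drops by a constant step per sample.
void fadeOutLinear(std::vector<std::int16_t>& samples) {
    if (samples.empty()) return;
    const float step = 1.0f / static_cast<float>(samples.size());
    float amp = 1.0f;
    for (std::int16_t& s : samples) {
        s = static_cast<std::int16_t>(amp * static_cast<float>(s));
        amp -= step;
    }
}

// Hypothetical helper: exponential fade, gain is multiplied by a constant
// factor per sample, so the level falls by the same number of dB each step.
void fadeOutExponential(std::vector<std::int16_t>& samples, float endGain = 0.1f) {
    if (samples.empty()) return;
    // Per-sample factor chosen so that amp equals endGain after the last sample.
    const float decay = std::pow(endGain, 1.0f / static_cast<float>(samples.size()));
    float amp = 1.0f;
    for (std::int16_t& s : samples) {
        s = static_cast<std::int16_t>(amp * static_cast<float>(s));
        amp *= decay;
    }
}
```

Since the gain never exceeds 1 in either function, the result always fits in a 16-bit sample and no clipping step is needed.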

This may just be a perception problem. Your ears (and eyes: look up gamma with respect to video) don't respond linearly to the input. A good model is that your ears perceive roughly a ln(n) increase for an n-fold increase in volume. Look up the difference between linear pots and audio (logarithmic-taper) pots.

Anyway, I don't know if that matters here, because your output amp may account for it, but if you want the sound to be perceived as twice as loud you may have to multiply the amplitude by considerably more than 2 (a common rule of thumb is about +10 dB, roughly a factor of 3). That may mean you're in the realm of clipping.

Licensed under: CC-BY-SA with attribution
