How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

https://stackoverflow.com/questions/1125666

13-09-2019

Question

I'm writing some code that plays back WAV files at different speeds, so that the wave is either slower and lower-pitched, or faster and higher-pitched. I'm currently using simple linear interpolation, like so:

            int newlength = (int)Math.Round(rawdata.Length * lengthMultiplier);
            float[] output = new float[newlength];

            for (int i = 0; i < newlength; i++)
            {
                float realPos = i / lengthMultiplier;
                int iLow = (int)realPos;
                int iHigh = iLow + 1;
                float remainder = realPos - (float)iLow;

                float lowval = 0;
                float highval = 0;
                if ((iLow >= 0) && (iLow < rawdata.Length))
                {
                    lowval = rawdata[iLow];
                }
                if ((iHigh >= 0) && (iHigh < rawdata.Length))
                {
                    highval = rawdata[iHigh];
                }

                output[i] = (highval * remainder) + (lowval * (1 - remainder));
            }

This works, but it only sounds acceptable when I lower the frequency of the playback (i.e. slow it down). If I raise the pitch on playback, this method tends to produce high-frequency artifacts, presumably because of the loss of sample information.

I know that bicubic and other interpolation methods resample using more than just the two nearest sample values as in my code example, but I can't find any good code samples (C# preferably) that I could plug in to replace my linear interpolation method here.

Does anyone know of any good examples, or can anyone write a simple bicubic interpolation method? I'll bounty this if I have to. :)

Update: here are a couple of C# implementations of interpolation methods (thanks to Donnie DeBoer for the first one and nosredna for the second):

    public static float InterpolateCubic(float x0, float x1, float x2, float x3, float t)
    {
        float a0, a1, a2, a3;
        a0 = x3 - x2 - x0 + x1;
        a1 = x0 - x1 - a0;
        a2 = x2 - x0;
        a3 = x1;
        return (a0 * (t * t * t)) + (a1 * (t * t)) + (a2 * t) + (a3);
    }

    public static float InterpolateHermite4pt3oX(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = .5F * (x2 - x0);
        float c2 = x0 - (2.5F * x1) + (2 * x2) - (.5F * x3);
        float c3 = (.5F * (x3 - x0)) + (1.5F * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

In these functions, x1 is the sample immediately before the point you're trying to estimate and x2 is the sample immediately after it. x0 is the sample to the left of x1, and x3 is the sample to the right of x2. t runs from 0 to 1 and is the fractional distance from x1 to the point you're estimating.

The Hermite method seems to work pretty well, and appears to reduce the noise somewhat. More importantly it seems to sound better when the wave is sped up.
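As a rough sketch of how the Hermite function could replace the linear step in the original loop, something like the following may work. The class name `HermiteResampler` and the helper `SampleOrZero` are made up for illustration; the out-of-range handling (treating missing neighbors as silence) mirrors the linear version, and `double` is used for the position only on the assumption that it avoids rounding drift on long files.

```csharp
using System;

public static class HermiteResampler
{
    // Same 4-point, 3rd-order Hermite formula as InterpolateHermite4pt3oX above.
    public static float InterpolateHermite4pt3oX(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = .5F * (x2 - x0);
        float c2 = x0 - (2.5F * x1) + (2 * x2) - (.5F * x3);
        float c3 = (.5F * (x3 - x0)) + (1.5F * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

    // Hypothetical helper: returns data[i], or 0 (silence) when i is out of range,
    // matching the bounds handling of the linear loop.
    private static float SampleOrZero(float[] data, int i)
    {
        return (i >= 0 && i < data.Length) ? data[i] : 0f;
    }

    public static float[] Resample(float[] rawdata, double lengthMultiplier)
    {
        int newlength = (int)Math.Round(rawdata.Length * lengthMultiplier);
        float[] output = new float[newlength];

        for (int i = 0; i < newlength; i++)
        {
            double realPos = i / lengthMultiplier;  // position in the source signal
            int i1 = (int)Math.Floor(realPos);      // index of x1 (sample just before)
            float t = (float)(realPos - i1);        // fractional distance from x1

            output[i] = InterpolateHermite4pt3oX(
                SampleOrZero(rawdata, i1 - 1),      // x0
                SampleOrZero(rawdata, i1),          // x1
                SampleOrZero(rawdata, i1 + 1),      // x2
                SampleOrZero(rawdata, i1 + 2),      // x3
                t);
        }
        return output;
    }
}
```

Calling `HermiteResampler.Resample(rawdata, lengthMultiplier)` would then take the place of the whole linear loop. When t is 0 the formula returns x1 exactly, so a multiplier of 1 should reproduce the input.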

Solution

My favorite resource for audio interpolation (especially in resampling applications) is Olli Niemitalo's "Elephant" paper.

I've used a couple of these and they sound terrific (much better than a straight cubic solution, which is relatively noisy). There are spline forms, Hermite forms, Watte, parabolic, etc. And they are discussed from an audio point-of-view. This is not just your typical naive polynomial fitting.

And code is included!

To decide which to use, you probably want to start with the table on page 60, which groups the algorithms by operator complexity (how many multiplies, and how many adds). Then choose among the best signal-to-noise solutions--use your ears as a guide to make the final choice. Note: Generally, the higher the SNR, the better.

OTHER TIPS

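double InterpCubic(double x0, double x1, double x2, double x3, double t)
{
   double a0, a1, a2, a3;

   // Coefficients of the cubic a0*t^3 + a1*t^2 + a2*t + a3
   a0 = x3 - x2 - x0 + x1;
   a1 = x0 - x1 - a0;
   a2 = x2 - x0;
   a3 = x1;

   // ^ is XOR in C-family languages, not exponentiation, so the powers are multiplied out
   return a0*(t*t*t) + a1*(t*t) + a2*t + a3;
}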

where x1 and x2 are the samples being interpolated between, x0 is x1's left neighbor, and x3 is x2's right neighbor. t is in [0, 1] and denotes the interpolation position between x1 and x2.

Honestly, cubic interpolation isn't generally much better for audio than linear. A simple suggestion for improving your linear interpolation would be to use an antialiasing (low-pass) filter: before the interpolation if you are shortening the signal, after it if you are lengthening it. Another option (though more computationally expensive) is sinc-interpolation, which can be done with very high quality.

We have released some simple, LGPL resampling code that can do both of these as part of WDL (see resample.h).
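To illustrate the sinc-interpolation idea mentioned above, here is a minimal windowed-sinc sketch. It is not the WDL code and makes no claim to match it; the class name `SincResampler`, the Hann window, the default `halfWidth` of 16, and the choice of `cutoff = min(1, lengthMultiplier)` are all assumptions made for the example. Lowering the cutoff when shortening the signal is what provides the antialiasing.

```csharp
using System;

public static class SincResampler
{
    // Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1.
    private static double Sinc(double x)
    {
        if (Math.Abs(x) < 1e-12) return 1.0;
        double px = Math.PI * x;
        return Math.Sin(px) / px;
    }

    // Estimates the signal at fractional position pos with a Hann-windowed sinc kernel.
    // cutoff is a fraction of the input Nyquist frequency, in (0, 1].
    // halfWidth is the kernel reach in input samples when cutoff is 1.
    public static float SampleAt(float[] data, double pos, int halfWidth, double cutoff)
    {
        double reach = halfWidth / cutoff;          // kernel widens as the cutoff drops
        int first = (int)Math.Ceiling(pos - reach);
        int last = (int)Math.Floor(pos + reach);
        double sum = 0.0;

        for (int n = first; n <= last; n++)
        {
            if (n < 0 || n >= data.Length) continue; // out-of-range samples count as silence
            double d = pos - n;
            double window = 0.5 * (1.0 + Math.Cos(Math.PI * d / reach)); // Hann window
            sum += data[n] * cutoff * Sinc(cutoff * d) * window;
        }
        return (float)sum;
    }

    public static float[] Resample(float[] rawdata, double lengthMultiplier, int halfWidth = 16)
    {
        int newlength = (int)Math.Round(rawdata.Length * lengthMultiplier);
        // When shortening (multiplier < 1), lower the cutoff so content that
        // would alias at the new rate is filtered out.
        double cutoff = Math.Min(1.0, lengthMultiplier);
        float[] output = new float[newlength];

        for (int i = 0; i < newlength; i++)
        {
            output[i] = SampleAt(rawdata, i / lengthMultiplier, halfWidth, cutoff);
        }
        return output;
    }
}
```

This evaluates the kernel directly for every output sample, which is slow. A practical implementation would more likely precompute the kernel in a table, and a larger `halfWidth` trades speed for quality.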

You're looking for polynomial interpolation. The idea is that you pick a number of known data points around the point you want to interpolate, compute an interpolating polynomial through those data points, and then evaluate that polynomial at the interpolation point.
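One way to sketch that idea is Lagrange interpolation over a run of consecutive samples. The class name `LagrangeInterpolator` and its parameters are hypothetical, and the caller is assumed to pass a `start` and `count` that stay inside the array.

```csharp
using System;

public static class LagrangeInterpolator
{
    // Evaluates, at position x, the unique polynomial of degree (count - 1) that
    // passes through the points (start + k, data[start + k]) for k = 0 .. count - 1.
    public static double Interpolate(float[] data, int start, int count, double x)
    {
        double result = 0.0;
        for (int j = 0; j < count; j++)
        {
            // Lagrange basis polynomial: 1 at sample j, 0 at every other sample.
            double basis = 1.0;
            for (int m = 0; m < count; m++)
            {
                if (m == j) continue;
                basis *= (x - (start + m)) / (double)(j - m);
            }
            result += data[start + j] * basis;
        }
        return result;
    }
}
```

For a 4-point fit around a fractional position p, a reasonable choice would be `start = (int)Math.Floor(p) - 1` and `count = 4`, so that p lies between the two middle samples.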

There are other methods. If you can stomach the math, look at signal reconstruction, or google for "signal interpolation".

Licensed under: CC-BY-SA with attribution
