Draw sound wave with possibility to zoom in/out

https://stackoverflow.com/questions/2305299

21-09-2019

Question

I'm writing a sound editor for my graduation project. I'm using BASS to extract samples from MP3, WAV, OGG and similar files and to add DSP effects such as echo and flanger. Simply put, I have built a framework that applies an effect from position1 to position2 and handles cut/paste.

Now my problem is that I want to create a control similar to the one in Cool Edit Pro that draws a waveform representation of the song and lets me zoom in/out, select portions of the waveform, and so on. After a selection I can do something like:

TInterval EditZone = WaveForm->GetSelection();

where TInterval has this form:

struct TInterval
{
    long Start;
    long End;
};

I'm a beginner when it comes to sophisticated drawing, so any hint on how to create a waveform representation of a song from the sample data returned by BASS, with the ability to zoom in/out, would be appreciated.

I'm writing my project in C++, but I can understand C# and Delphi code, so feel free to post snippets in those two languages as well :)

Thanx DrOptix

Solution

By zoom, I presume you mean horizontal zoom rather than vertical. The way audio editors do this is to scan the waveform, breaking it up into time windows so that each pixel in X represents some number of samples. That number can be fractional, but you can get away with disallowing fractional zoom ratios without annoying the user too much. Once you zoom out a bit, the max value in each window is in practice almost always positive and the min value almost always negative.

For each pixel on the screen, you need to know the minimum and the maximum sample value for that pixel. So you need a function that scans the waveform data in chunks and keeps track of the accumulated max and min for each chunk.
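The chunked scan described above can be sketched roughly as follows. This is only an illustration, assuming mono 16-bit samples already held in memory and an integer zoom ratio; the names MinMax and ScanMinMax are made up for this example and are not part of BASS.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimum and maximum sample value covered by one pixel column.
struct MinMax {
    std::int16_t min;
    std::int16_t max;
};

// Reduce mono 16-bit samples to one MinMax per pixel column.
// samplesPerPixel is the integer horizontal zoom ratio (assumed >= 1).
std::vector<MinMax> ScanMinMax(const std::vector<std::int16_t>& samples,
                               std::size_t samplesPerPixel)
{
    std::vector<MinMax> columns;
    if (samplesPerPixel == 0) return columns;
    columns.reserve((samples.size() + samplesPerPixel - 1) / samplesPerPixel);

    for (std::size_t start = 0; start < samples.size(); start += samplesPerPixel) {
        // The last chunk may be shorter than samplesPerPixel.
        const std::size_t end = std::min(start + samplesPerPixel, samples.size());
        MinMax mm{samples[start], samples[start]};
        for (std::size_t i = start + 1; i < end; ++i) {
            mm.min = std::min(mm.min, samples[i]);
            mm.max = std::max(mm.max, samples[i]);
        }
        columns.push_back(mm);
    }
    return columns;
}
```

Each entry of the returned vector corresponds to one pixel column, left to right.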

This is a slow process, so professional audio editors keep a pre-calculated table of min and max values at some fixed zoom ratio, for example 512:1 or 1024:1. When you are drawing at a zoom ratio above that (say, more than 1024 samples/pixel), you use the pre-calculated table; below that ratio you get the data directly from the file. If you don't do this, you will find that your drawing code gets too slow when you zoom out.
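One way to use such a table when zoomed out is to merge neighbouring table entries into coarser columns. The sketch below assumes the requested zoom is a whole multiple of the table's ratio, which is a simplification of my own and not something the answer requires; MinMax, CanUseTable and ReduceTable are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct MinMax {
    std::int16_t min;
    std::int16_t max;
};

// True when a zoom level can be served from a table built at
// samplesPerEntry samples per entry (assumption: whole multiples only).
bool CanUseTable(std::size_t samplesPerPixel, std::size_t samplesPerEntry)
{
    return samplesPerEntry != 0 &&
           samplesPerPixel >= samplesPerEntry &&
           samplesPerPixel % samplesPerEntry == 0;
}

// Merge every `factor` consecutive table entries into one column,
// where factor = samplesPerPixel / samplesPerEntry.
std::vector<MinMax> ReduceTable(const std::vector<MinMax>& table, std::size_t factor)
{
    std::vector<MinMax> out;
    if (factor == 0) return out;
    for (std::size_t start = 0; start < table.size(); start += factor) {
        const std::size_t end = std::min(start + factor, table.size());
        MinMax mm = table[start];
        for (std::size_t i = start + 1; i < end; ++i) {
            mm.min = std::min(mm.min, table[i].min);
            mm.max = std::max(mm.max, table[i].max);
        }
        out.push_back(mm);
    }
    return out;
}
```

With a table built at 512 samples per entry, drawing at 2048 samples/pixel would call ReduceTable(table, 4); when CanUseTable returns false, the samples would be read from the file instead.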

It's worthwhile to write code that handles all of the channels of the file in a single pass when doing this scanning, because slowness here will make your whole program feel sluggish. It's the disk I/O that matters; the CPU has no trouble keeping up, so straightforward C++ code is fine for building the min/max tables. But you don't want to go through the file more than once, and you want to read it sequentially.

Once you have the min/max tables, keep them around. You want to go back to the disk as little as possible, and many of the reasons for repainting your window do not require rebuilding the min/max tables. The memory cost of holding on to them is small compared to the disk I/O cost of building them in the first place.
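A single sequential pass over all channels might look something like this. It is a sketch under the assumption that the decoder delivers interleaved 16-bit frames in blocks of arbitrary size; PeakBuilder and its methods are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct MinMax {
    std::int16_t min;
    std::int16_t max;
};

// Builds one min/max table per channel from interleaved 16-bit frames,
// fed sequentially in blocks of any size (for example, as read from disk).
class PeakBuilder {
public:
    // framesPerEntry is assumed to be >= 1.
    PeakBuilder(std::size_t channels, std::size_t framesPerEntry)
        : channels_(channels), framesPerEntry_(framesPerEntry),
          tables_(channels), current_(channels) {}

    // data holds frameCount frames; each frame has one sample per channel.
    void Feed(const std::int16_t* data, std::size_t frameCount)
    {
        for (std::size_t f = 0; f < frameCount; ++f) {
            for (std::size_t c = 0; c < channels_; ++c) {
                const std::int16_t s = data[f * channels_ + c];
                if (framesInEntry_ == 0) {
                    current_[c] = MinMax{s, s};   // first frame of a new entry
                } else {
                    current_[c].min = std::min(current_[c].min, s);
                    current_[c].max = std::max(current_[c].max, s);
                }
            }
            if (++framesInEntry_ == framesPerEntry_) Flush();
        }
    }

    // Call once after the last block to emit a partially filled entry.
    void Finish() { if (framesInEntry_ > 0) Flush(); }

    const std::vector<MinMax>& Table(std::size_t channel) const { return tables_[channel]; }

private:
    void Flush()
    {
        for (std::size_t c = 0; c < channels_; ++c) tables_[c].push_back(current_[c]);
        framesInEntry_ = 0;
    }

    std::size_t channels_;
    std::size_t framesPerEntry_;
    std::size_t framesInEntry_ = 0;
    std::vector<std::vector<MinMax>> tables_;
    std::vector<MinMax> current_;
};
```

The builder keeps its state between calls, so the file can be read once, front to back, in whatever block size suits the disk.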

Then you draw the waveform as a series of 1-pixel-wide vertical lines, each running from the max value to the min value for the time represented by that pixel. This should be quite fast if you are drawing from pre-built min/max tables.
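The drawing step mostly comes down to mapping sample values to pixel rows. The sketch below computes the endpoints of each vertical line and leaves the actual line call to whatever graphics API the control uses; VLine, SampleToY and BuildLines are hypothetical names, and the linear mapping (full 16-bit range stretched over the control height) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct MinMax {
    std::int16_t min;
    std::int16_t max;
};

// One vertical line to draw: column x, from row yTop down to row yBottom.
struct VLine {
    int x;
    int yTop;
    int yBottom;
};

// Map a sample in [-32768, 32767] to a pixel row in [0, height - 1],
// with +32767 on the top row and -32768 on the bottom row.
int SampleToY(std::int16_t sample, int height)
{
    const long long shifted = 32767LL - sample;   // 0 .. 65535
    return static_cast<int>(shifted * (height - 1) / 65535LL);
}

// Turn a run of min/max columns into line endpoints, one per pixel column.
std::vector<VLine> BuildLines(const std::vector<MinMax>& columns, int height)
{
    std::vector<VLine> lines;
    lines.reserve(columns.size());
    for (std::size_t x = 0; x < columns.size(); ++x) {
        lines.push_back(VLine{static_cast<int>(x),
                              SampleToY(columns[x].max, height),
                              SampleToY(columns[x].min, height)});
    }
    return lines;
}
```

In the paint handler, each VLine would then be passed to the line-drawing call of the toolkit in use.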

OTHER TIPS

I've recently done this myself. As Marius suggests you need to work out how many samples are at each column of pixels. You then work out the minimum and maximum and then plot a vertical line from the maximum to the minimum.

As a first pass this seemingly works fine. The problem you'll run into is that as you zoom out it starts to take too long to retrieve the samples from disk. As a solution to this I built a "peak" file alongside the audio file. The peak file stores the minimum/maximum pairs for groups of n samples. Playing with n until you get the right amount is up to you. Personally I found 128 samples to be a good tradeoff between size and speed. It's also worth remembering that, unless you are drawing a control larger than 65536 pixels in size, you needn't store this peak information as anything more than 16-bit values, which saves a bit of space.

Wouldn't you just plot the sample points on a 2D canvas? You should know how many samples there are per second for a file (read it from the header), and then plot the value on the y axis. Since you want to be able to zoom in and out, you need to control the number of samples per pixel (the zoom level). Next you take the average of the sample points for each pixel (for example, the average of every 5 points if you have 5 samples per pixel). Then you can use a 2D drawing API to draw lines between the points.
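The averaging step in this suggestion could be sketched as below, assuming mono 16-bit samples in memory; AveragePerPixel is a made-up name. Note that averaging lets positive and negative samples cancel, so a zoomed-out view built this way tends to flatten toward zero, which is why the other answers track min and max instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average each group of samplesPerPixel samples into one value per pixel column.
std::vector<double> AveragePerPixel(const std::vector<std::int16_t>& samples,
                                    std::size_t samplesPerPixel)
{
    std::vector<double> averages;
    if (samplesPerPixel == 0) return averages;
    for (std::size_t start = 0; start < samples.size(); start += samplesPerPixel) {
        std::size_t end = start + samplesPerPixel;
        if (end > samples.size()) end = samples.size();   // short last group
        long long sum = 0;
        for (std::size_t i = start; i < end; ++i) sum += samples[i];
        averages.push_back(static_cast<double>(sum) /
                           static_cast<double>(end - start));
    }
    return averages;
}
```

The resulting values would be joined with lines from one pixel column to the next.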

Using the open source NAudio package:

using System;
using System.Collections.Generic;
using NAudio.Wave;

// Assumes 16-bit PCM input. Samples from all channels are merged into one range per pixel.
public class WavReader2
{
    private readonly WaveFileReader _objStream;

    public WavReader2(String sPath)
    {
        _objStream = new WaveFileReader(sPath);
    }

    // Returns one high/low pair per pixel column for the whole file.
    public List<SampleRangeValue> GetPixelGraph(int iSamplesPerPixel)
    {
        List<SampleRangeValue> colOutputValues = new List<SampleRangeValue>();

        if (_objStream != null)
        {
            // Bytes per frame: one sample for every channel.
            int iBytesPerSample = (_objStream.WaveFormat.BitsPerSample / 8) * _objStream.WaveFormat.Channels;
            int iNumPixels = (int)Math.Ceiling(_objStream.SampleCount / (double)iSamplesPerPixel);

            byte[] aryWaveData = new byte[iSamplesPerPixel * iBytesPerSample];
            _objStream.Position = 0; // to start elsewhere: startPosition + (e.ClipRectangle.Left * iBytesPerSample * iSamplesPerPixel);

            for (int iPixelNum = 0; iPixelNum < iNumPixels; iPixelNum++)
            {
                int iBytesRead = _objStream.Read(aryWaveData, 0, aryWaveData.Length);
                if (iBytesRead < 2)
                    break;

                // Track the running min/max instead of collecting every sample in a list.
                short iCurrentLowValue = short.MaxValue;
                short iCurrentHighValue = short.MinValue;
                for (int n = 0; n + 1 < iBytesRead; n += 2)
                {
                    short iSampleValue = BitConverter.ToInt16(aryWaveData, n);
                    if (iSampleValue < iCurrentLowValue) iCurrentLowValue = iSampleValue;
                    if (iSampleValue > iCurrentHighValue) iCurrentHighValue = iSampleValue;
                }

                // Dividing by ushort.MaxValue gives values in roughly -0.5 .. 0.5,
                // i.e. a fraction of the full control height around the centre line.
                float fLowPercent = (float)iCurrentLowValue / ushort.MaxValue;
                float fHighPercent = (float)iCurrentHighValue / ushort.MaxValue;

                colOutputValues.Add(new SampleRangeValue(fHighPercent, fLowPercent));
            }
        }

        return colOutputValues;
    }
}

public struct SampleRangeValue
{
    public float HighPercent;
    public float LowPercent;
    public SampleRangeValue(float fHigh, float fLow)
    {
        HighPercent = fHigh;
        LowPercent = fLow;
    }
}

Licensed under: CC-BY-SA with attribution
