Question

I'm tasked with building a .NET client app to detect silence in a WAV files.

Is this possible with the built-in Windows APIs? Or alternately, any good libraries out there to help with this?

Was it helpful?

Solution

Audio analysis is a difficult thing requiring a lot of complex math (think Fourier Transforms). The question you have to ask is "what is silence". If the audio that you are trying to edit is captured from an analog source, the chances are that there isn't any silence... they will only be areas of soft noise (line hum, ambient background noise, etc).

All that said, an algorithm that should work would be to determine a minimum volume (amplitude) threshold and duration (say, <10dbA for more than 2 seconds) and then simply do a volume analysis of the waveform looking for areas that meet this criteria (with perhaps some filters for millisecond spikes). I've never written this in C#, but this CodeProject article looks interesting; it describes C# code to draw a waveform... that is the same kind of code which could be used to do other amplitude analysis.

OTHER TIPS

http://www.codeproject.com/Articles/19590/WAVE-File-Processor-in-C

This has all the code necessary to strip silence, and mix wave files.

Enjoy.

If you want to efficiently calculate the average power over a sliding window: square each sample, then add it to a running total. Subtract the squared value from N samples previous. Then move to the next step. This is the simplest form of a CIC Filter. Parseval's Theorem tells us that this power calculation is applicable to both time and frequency domains.

Also you may want to add Hysteresis to the system to avoid switching on&off rapidly when power level is dancing about the threshold level.

I'm using NAudio, and I wanted to detect the silence in audio files so I can either report or truncate.

After a lot of research, I came up with this basic implementation. So, I wrote an extension method for the AudioFileReader class which returns the silence duration at the start/end of the file, or starting from a specific position.

Here:

static class AudioFileReaderExt
{
    public enum SilenceLocation { Start, End }

    private static bool IsSilence(float amplitude, sbyte threshold)
    {
        double dB = 20 * Math.Log10(Math.Abs(amplitude));
        return dB < threshold;
    }
    public static TimeSpan GetSilenceDuration(this AudioFileReader reader,
                                              SilenceLocation location,
                                              sbyte silenceThreshold = -40)
    {
        int counter = 0;
        bool volumeFound = false;
        bool eof = false;
        long oldPosition = reader.Position;

        var buffer = new float[reader.WaveFormat.SampleRate * 4];
        while (!volumeFound && !eof)
        {
            int samplesRead = reader.Read(buffer, 0, buffer.Length);
            if (samplesRead == 0)
                eof = true;

            for (int n = 0; n < samplesRead; n++)
            {
                if (IsSilence(buffer[n], silenceThreshold))
                {
                    counter++;
                }
                else
                {
                    if (location == SilenceLocation.Start)
                    {
                        volumeFound = true;
                        break;
                    }
                    else if (location == SilenceLocation.End)
                    {
                        counter = 0;
                    }
                }
            }
        }

        // reset position
        reader.Position = oldPosition;

        double silenceSamples = (double)counter / reader.WaveFormat.Channels;
        double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
        return TimeSpan.FromMilliseconds(silenceDuration);
    }
}

This will accept almost any audio file format not just WAV.

Usage:

using (AudioFileReader reader = new AudioFileReader(filePath))
{
    TimeSpan duration = reader.GetSilenceDuration(AudioFileReaderExt.SilenceLocation.Start);
    Console.WriteLine(duration.TotalMilliseconds);
}

References:

I don't think you'll find any built-in APIs for detection of silence. But you can always use good ol' math/discreete signal processing to find out loudness. Here's a small example: http://msdn.microsoft.com/en-us/magazine/cc163341.aspx

Use Sox. It can remove leading and trailing silences, but you'll have to call it as an exe from your app.

See code below from Detecting audio silence in WAV files using C#

private static void SkipSilent(string fileName, short silentLevel)
{
    WaveReader wr = new WaveReader(File.OpenRead(fileName));
    IntPtr format = wr.ReadFormat();
    WaveWriter ww = new WaveWriter(File.Create(fileName + ".wav"), 
        AudioCompressionManager.FormatBytes(format));
    int i = 0;
    while (true)
    {
        byte[] data = wr.ReadData(i, 1);
        if (data.Length == 0)
        {
            break;
        }
        if (!AudioCompressionManager.CheckSilent(format, data, silentLevel))
        {
            ww.WriteData(data);
        }
    }
    ww.Close();
    wr.Close();
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top