Question

I need to monitor my audio line-in on Linux, and whenever audio is played, the sound must be recorded and saved to a file, similar to how motion monitors the video feed.

Is it possible to do this with bash? Something along the lines of:

#!/bin/bash

# audio device
device=/dev/audio-line-in

# below this threshold audio will not be recorded.
noise_threshold=10

# folder where recordings are stored
storage_folder=~/recordings

# run indefinitely, until Ctrl-C is pressed
while true; do
   # noise_level() represents a function to determine
   # the noise level from device
   if noise_level( $device ) > $noise_threshold; then
     # stream from device to file, can be encoded to mp3 later.
     cat $device > $storage_folder/$(date +%FT%T).raw         
   fi;
done;

EDIT: The flow I'd like to get from this program is

 a. when noise > threshold, start recording  
 b. stop recording when noise < threshold for 10 seconds
 c. save recorded piece to separate file

Solution

SoX is the Swiss Army knife of sound processing, and you can use it to analyze recordings. The shortcomings of the following solution are:

  1. You need to split up the recordings to fixed size chunks
  2. You can lose recording time (due to killing/analyzing/restarting of recordings)

A further improvement would be to run the analysis asynchronously, although that complicates the job.

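#!/bin/bash

record_interval=5
noise_threshold=3
storage_folder=~/recordings

mkdir -p "$storage_folder"
exec 2>/dev/null        # suppress error output by default
while true; do
    rec out.wav &
    rec_pid=$!              # PID of the background recording
    sleep "$record_interval"
    kill -TERM "$rec_pid"   # TERM rather than KILL, so rec can finish writing the file
    wait "$rec_pid"
    # peak level of the chunk as reported by "stats -s 16", truncated to an integer
    max_level="$(sox out.wav -n stats -s 16 2>&1 | awk '/^Max level/ {print int($3)}')"
    # treat a missing value (e.g. sox failed) as 0 so the test does not break
    if [ "${max_level:-0}" -gt "$noise_threshold" ]; then
        mv out.wav "${storage_folder}/recording-$(date +%FT%T).wav"
    else
        rm -f out.wav
    fi
done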

Update:

The following solution uses a FIFO as the output of rec. Because split reads from this pipe to produce the chunks, there should be no loss of recording time:

#!/bin/bash

noise_threshold=3
storage_folder=~/recordings
raw_folder=~/recordings/tmp
split_folder=~/recordings/split
sox_raw_options="-t raw -r 48k -e signed -b 16"
split_size=1048576 # 1M

mkdir -p "${raw_folder}" "${split_folder}"

# create the FIFO unless it already exists
test -p "${raw_folder}/in.raw" || mkfifo "${raw_folder}/in.raw"

# start recording and splitting in the background
# (${sox_raw_options} is left unquoted on purpose so it splits into separate arguments)
rec ${sox_raw_options} - >"${raw_folder}/in.raw" 2>/dev/null &
split -b ${split_size} - <"${raw_folder}/in.raw" "${split_folder}/piece" &

while true; do
    # check each finished raw file (a piece is finished once it reaches split_size bytes)
    for raw in $(find "${split_folder}" -size ${split_size}c); do
        max_level="$(sox ${sox_raw_options} "${raw}" -n stats -s 16 2>&1 | awk '/^Max level/ {print int($3)}')"
        if [ "${max_level:-0}" -gt "$noise_threshold" ]; then
            sox ${sox_raw_options} "${raw}" "${storage_folder}/recording-$(date +%FT%T).wav"
        fi
        rm "${raw}"
    done
    sleep 1
done
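
To get a feel for how long each piece is, the chunk size can be converted to seconds. This is only a small sketch: `chunk_seconds` is a hypothetical helper, and the channel count of 1 is an assumption, since the raw options above do not set `-c` explicitly.

```shell
#!/bin/bash
# chunk_seconds BYTES RATE BITS [CHANNELS]
# Duration of one raw chunk: bytes / (rate * bytes_per_sample * channels).
# CHANNELS defaults to 1, which is an assumption about the recording setup.
chunk_seconds() {
    local bytes=$1 rate=$2 bits=$3 channels=${4:-1}
    awk -v b="$bytes" -v r="$rate" -v s="$bits" -v c="$channels" \
        'BEGIN { printf "%.2f\n", b / (r * (s / 8) * c) }'
}

# 1 MiB of 48 kHz, 16-bit mono audio
chunk_seconds 1048576 48000 16 1
```

With these values a piece covers roughly 10.9 seconds, so a smaller split_size gives finer-grained recordings at the cost of more files to analyze.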

OTHER TIPS

Here's an even better one:

sox -t alsa default ./recording.flac silence 1 0.1 5% 1 1.0 5%

It produces an audio file only when there is sound: leading silence is trimmed, and recording stops once the input has been quiet for one second. So no gaps and no long silences like the approaches above!
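
As far as I can tell, sox exits once the trailing silence is detected, so a loop can restart it to get one file per sound event. The following is an untested sketch along those lines; `record_one`, `monitor`, `threshold` and `silence_secs` are names made up for this example, and the ALSA device `default` is taken from the command above.

```shell
#!/bin/bash
# Sketch: one file per sound event, by restarting sox each time it exits.
storage_folder=${storage_folder:-$HOME/recordings}
threshold=${threshold:-5%}           # level that counts as sound
silence_secs=${silence_secs:-1.0}    # quiet time that ends a recording

# record_one FILE: waits for sound, records it, returns after the trailing silence.
record_one() {
    sox -t alsa default "$1" silence 1 0.1 "$threshold" 1 "$silence_secs" "$threshold"
}

# monitor [RECORDER]: call RECORDER in a loop; each return marks the end of one recording.
monitor() {
    local recorder=${1:-record_one} out
    mkdir -p "$storage_folder"
    while true; do
        # the name reflects when sox started waiting, not when the sound began
        out="$storage_folder/recording-$(date +%FT%T).flac"
        "$recorder" "$out" || break      # stop looping if the recorder fails
        [ -s "$out" ] || rm -f "$out"    # discard empty files
    done
}

# Usage (stop with Ctrl-C):
#   silence_secs=10 monitor
```

Setting silence_secs to 10 would match the "stop after 10 seconds below the threshold" flow from the question.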

Here's a sketch of how to improve on Jürgen's solution: it's just double-buffering, so while you are analyzing one file you have already started recording the next. My guess is that this trick will reduce gaps to the order of 100 milliseconds, but you would have to do some experiments to find out.

Completely untested!

#!/bin/bash

record_interval=5
noise_threshold=3
storage_folder=~/recordings

exec 2>/dev/null        # suppress error output by default

# maybe_save FILE ARCHIVE_NAME: keep FILE if its peak level exceeds the threshold
function maybe_save {
    max_level="$(sox "$1" -n stats -s 16 2>&1 |
                 awk '/^Max level/ {print int($3)}')"
    if [ "${max_level:-0}" -gt "$noise_threshold" ]; then
      mv "$1" "${storage_folder}/recording-$2"
    else
      rm "$1"
    fi
}

i=0
while true; do
    this=out$i.wav
    rec "$this" &
    pid=$!              # PID of the background rec ($? would be an exit status)
    if [ $i -gt 9 ]; then i=0; else i=$((i + 1)); fi
    archive=$(date +%FT%T).wav
    sleep "$record_interval"
    kill -TERM "$pid"
    wait "$pid"         # let rec finish writing before the file is analyzed
    maybe_save "$this" "$archive" &
done

The key is that the moment you kill the recording process, you launch analysis in the background and then take another trip around the loop to record the next fragment. Really you should launch the next recording process first, then the analysis, but that will make the control flow a bit uglier. I'd measure first to see what kinds of skips you're getting.

rec -c CHANNELS -r RATE -b BITS -n OUTPUT.AUDIOTYPE noisered NOISEREDUCTION.noise-profile silence 1 5 1% 1 1t 1%

This will monitor the default microphone input continuously until a sound is heard that exceeds 1% of the noise-reduced level, then write a file of type AUDIOTYPE (mp4, flac, wav, raw, etc.) at RATE Hz, BITS bits and CHANNELS channels. Recording will stop after 1 second of silence, as measured at 1% of the noise-reduced level. The output file will be (mostly) cleaned of background noise.
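
The noise profile file has to exist before noisered can use it. One way to create it, sketched here under assumptions, is to record a few seconds of background noise and run SoX's noiseprof effect on the sample. `make_noise_profile` is a hypothetical helper and the file names are placeholders.

```shell
#!/bin/bash
# Sketch: build the profile file that the noisered effect reads.

# make_noise_profile SAMPLE PROFILE [SECONDS]
# Records SECONDS (default 2) of background noise into SAMPLE, then writes
# its noise profile to PROFILE. Keep the input quiet while this runs.
make_noise_profile() {
    local sample=$1 profile=$2 seconds=${3:-2}
    rec "$sample" trim 0 "$seconds" || return 1
    sox "$sample" -n noiseprof "$profile"
}

# Usage:
#   make_noise_profile background.wav NOISEREDUCTION.noise-profile
```

The profile only describes the noise present when the sample was taken, so it is probably worth regenerating whenever the recording environment changes.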

Now, if someone can just tell me how to determine that the recording has stopped programmatically, I can make it useful for continuous monitoring voice recognition.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow