Question

I am facing the following problem:

I recorded audio with PyAudio and saved it as a WAV file. The file is sampled at 48000 Hz (no other rate works; I get a sampling rate error, but that is another story). The WAV file sounds fine. Now I want to convert it to FLAC so that I can send it to the Google Speech API.

The problem is that avconv seems to convert my 48 kHz input WAV into an 8 kHz FLAC, even with -ar 48000. The FLAC file is just white noise. I have tried a lot of things, but even Google has no answer ;)

Note: everything worked fine with another microphone at 16 kHz, with neither the PyAudio sampling rate error nor the avconv problem.

Here is the code:

Recording:

import audioop
from collections import deque

import pyaudio

chunk = 2048
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 48000
THRESHOLD = 525  # Intensity threshold: anything below it counts as silence.
SILENCE_LIMIT = 3  # Seconds of continuous silence after which recording stops and the file is delivered.

# The enclosing function is not shown in the post (note the `return` at the
# end); the name `listen_for_speech` is a placeholder.
def listen_for_speech():
    # open stream
    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=chunk)

    print("* listening. CTRL+C to finish manually.")
    all_m = []
    data = ''
    rel = RATE // chunk  # chunks per second; deque needs an integer maxlen
    slid_win = deque(maxlen=SILENCE_LIMIT * rel)
    started = False

    while True:
        data = stream.read(chunk)
        slid_win.append(abs(audioop.avg(data, 2)))

        if any(x > THRESHOLD for x in slid_win):
            if not started:
                print("starting record")
            started = True
            all_m.append(data)
        elif started:
            print("finished")
            # the limit was reached, finish capture and deliver
            filename = save_speech(all_m, p)
            result = stt_google_wav(filename)
            # reset all
            started = False
            print("Google STT Done")
            stream.close()
            p.terminate()
            return result
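
The sliding-window silence check in the loop above can be tried without a microphone. The following is only a sketch under stated assumptions: `find_stop_chunk` is a hypothetical helper that is not part of the original script, and it takes a plain list of per-chunk intensity values in place of what `audioop.avg` would return for live audio.

```python
from collections import deque


def find_stop_chunk(levels, threshold=525, rate=48000, chunk=2048, silence_limit=3):
    """Return the index of the chunk at which recording would stop, or None.

    `levels` stands in for the per-chunk averages computed in the real loop.
    """
    rel = rate // chunk                      # whole chunks per second
    window = deque(maxlen=silence_limit * rel)
    started = False
    for i, level in enumerate(levels):
        window.append(abs(level))
        if any(x > threshold for x in window):
            started = True                   # at least one loud chunk is still in the window
        elif started:
            return i                         # window holds only silence: stop here
    return None                              # never started, or never went quiet long enough


# 10 quiet chunks, 5 loud chunks, then silence
levels = [0] * 10 + [1000] * 5 + [0] * 100
stop = find_stop_chunk(levels)
```

With the values from the question the window holds 3 * (48000 // 2048) = 69 chunks, so recording stops 69 chunks (about 2.9 seconds) after the last loud chunk rather than after exactly 3 seconds.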

AND:

import time
import wave

import pyaudio

def save_speech(data, p):
    filename = 'output_' + str(int(time.time()))
    # write data to WAVE file
    data = b''.join(data)  # the chunks returned by stream.read() are byte strings
    wf = wave.open(filename + '.wav', 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
    wf.setframerate(48000)
    wf.writeframes(data)
    wf.close()
    print("finished saving wav: %s" % filename)
    return filename  # base name, without the .wav extension

To Convert to Flac:

os.system("avconv -i "+ filename+".wav  -y -ar 48000 "+ filename+ ".flac")
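
As a possible alternative to building a shell string, the same avconv call can be passed as an argument list, which avoids quoting problems in file names and makes the exit status easy to check. This is a sketch only: `build_avconv_cmd` and `convert_to_flac` are hypothetical helper names, and it assumes `avconv` is on the PATH and accepts the same flags as the command above.

```python
import subprocess


def build_avconv_cmd(basename, rate=48000):
    """Build the same command as the os.system() call, as an argument list."""
    return ["avconv", "-i", basename + ".wav",
            "-y", "-ar", str(rate),
            basename + ".flac"]


def convert_to_flac(basename, rate=48000):
    """Run avconv and return its exit status (0 means success)."""
    return subprocess.call(build_avconv_cmd(basename, rate))


cmd = build_avconv_cmd("output_1385413929")
```

A non-zero return value from `convert_to_flac` would be a hint to look at the avconv output before sending the file anywhere.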

EDIT 1:

The FLAC is actually 48 kHz; I don't know why mplayer reports it as 8 kHz. I played it on my PC and the FLAC is perfect. The Google API still seems to have a problem with it, though, because it returns nothing. I assume the white noise from mplayer on the Raspberry Pi is connected to the problem with the Google API, but I have no idea what the cause could be.

Wav File:

output_1385413929.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 48000 Hz

Flac File:

output_1385413929.flac: FLAC audio bitstream data, 16 bit, mono, 48 kHz, 204800 samples
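
The WAV header can also be checked from Python with the standard `wave` module, independently of what a player reports. This is a minimal sketch; `wav_info` is a hypothetical helper name, and the standard library cannot read FLAC, so it only covers the WAV side.

```python
import wave


def wav_info(path):
    """Read channel count, bit depth, sample rate and length from a WAV header."""
    wf = wave.open(path, 'rb')
    try:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return {
            "channels": wf.getnchannels(),
            "bits": 8 * wf.getsampwidth(),
            "rate": rate,
            "frames": frames,
            "seconds": frames / float(rate),
        }
    finally:
        wf.close()
```

For the files above, 204800 samples at 48000 Hz is about 4.27 seconds, which is exactly 100 chunks of 2048 frames, so the FLAC length is consistent with what the recording loop would have written.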

Solved: I don't know why, but when I turned on my Pi to test some more, it suddenly worked without my changing anything.

Thanks for your help. Greetings from Germany, Flo

Solution

I agree - works down the line for me:

me@raspberrypi /mnt/share/Audio/xxxxxx $ file sample_audio.wav 
sample_audio.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 8000 Hz
me@raspberrypi /mnt/share/Audio/xxxxxx $ file sample_audio.flac 
sample_audio.flac: FLAC audio bitstream data, 16 bit, stereo, 48 kHz, 9131406 samples
Licensed under: CC-BY-SA with attribution