Question

I'm trying to understand something I'm seeing in PyAudio. I'm trying to take a 200ms sample of audio, wait a few seconds, and then take three more 200ms samples of audio. Consider this code:

import pyaudio
import time

p = pyaudio.PyAudio()
chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1 #chan
RATE = 11025
stream = p.open(format = FORMAT,
            channels = CHANNELS,
            rate = RATE,
            input = True,
            output = False,
            frames_per_buffer = chunk,
            input_device_index = 0,
            output_device_index = 0)

def record(seconds):
    all = []
    for i in range(0, int(RATE / chunk * seconds)):
        data = stream.read(chunk)
        all.append(data)
    data = ''.join(all)
    return data

#record 200ms of sound
print "pre-record 1 " + str(time.time())
data = record(.2)
print "post-record 1 " + str(time.time())

#sleep for one second
#time.sleep(1)

#record 200ms of sound
print "pre-record 2 " + str(time.time())
data = record(.2)
print "post-record 2 " + str(time.time())

print "pre-record 3 " + str(time.time())
data = record(.2)
print "post-record 3 " + str(time.time())

print "pre-record 4 " + str(time.time())
data = record(.2)
print "post-record 4 " + str(time.time())

If I run this "as-is" (i.e. with time.sleep() commented out), I get this, which makes sense:

pre-record 1 1357526364.46
post-record 1 1357526364.67
pre-record 2 1357526364.67
post-record 2 1357526364.86
pre-record 3 1357526364.86
post-record 3 1357526365.03
pre-record 4 1357526365.03
post-record 4 1357526365.22

If I then uncomment the time.sleep() line to add a one second delay between the first and second recording, I get this:

pre-record 1 1357525897.09
post-record 1 1357525897.28
pre-record 2 1357525898.28
post-record 2 1357525898.28
pre-record 3 1357525898.28
post-record 3 1357525898.28
pre-record 4 1357525898.28
post-record 4 1357525898.47

Notice that while there is a one-second delay in the timestamps between the first and second recordings, there is no delay at all between recordings 2 and 3, and they appear to have been taken in zero time (although they actually do contain 200ms of data). Recording four appears to have taken 200ms to get, but was started at the same time as recordings 2 and 3.

If I replace time.sleep(1) with record(1) (and not save/use the data from the 1-second recording), the program behaves as I expect:

pre-record 1 1357526802.57
post-record 1 1357526802.77
pre-record 2 1357526803.69
post-record 2 1357526803.88
pre-record 3 1357526803.88
post-record 3 1357526804.06
pre-record 4 1357526804.06
post-record 4 1357526804.25

So it would appear that even during the time.sleep(1), a buffer somewhere is still being fed audio, and when I call the record function after the sleep, it's grabbing the audio from that buffer, which is not the audio I want. I need the audio after the sleep, not during it. Can someone help me understand PyAudio's interaction with whatever buffer is there, and is there a way to better know at what time my audio was actually captured?


Solution

The audio device continually adds RATE samples per second to the stream's input buffer, whether or not your code is reading, and stream.read() returns data from that buffer.
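
As a rough check against the numbers in the question, you can work out how much audio each record() call consumes and how much piles up during a sleep. This is only a sketch using the RATE and chunk values posted there; the helper names are made up for illustration:

```python
RATE = 11025   # frames per second, as in the question
CHUNK = 1024   # frames per stream.read() call, as in the question


def frames_recorded(seconds):
    # Mirrors the loop in record(): a whole number of chunks is read.
    # Floor division matches what RATE / chunk does for ints in Python 2.
    chunks = int(RATE // CHUNK * seconds)
    return chunks * CHUNK


def frames_captured_while_idle(seconds):
    # Frames the device delivers while the program is not reading.
    return int(RATE * seconds)


per_call = frames_recorded(0.2)
print(per_call)                          # frames per record(.2) call
print(round(per_call / float(RATE), 3))  # seconds of audio per call
print(frames_captured_while_idle(1))     # frames produced during sleep(1)
```

Each record(.2) call pulls 2048 frames, about 0.19 s of audio, which matches the timestamps in the question. A one-second sleep produces 11025 frames, enough to satisfy several such calls instantly if the buffer keeps them all. In the posted output only two calls returned immediately, which suggests the buffer holds less than a full second; that is an inference from the timestamps, not documented behavior.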

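Because of this, you can't call time.sleep(1) and expect the next read to start with fresh audio: the samples captured during the sleep are still queued in the buffer. Instead of sleeping, you can read and discard the samples you want to skip: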

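def skip(seconds):
    # Use in place of time.sleep(seconds): reads and discards
    # about `seconds` worth of audio from the stream.
    samples = int(seconds * RATE)
    count = 0
    while count < samples:
        stream.read(chunk)  # blocks until `chunk` frames are available
        count += chunk
        time.sleep(0.01)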
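
If you'd rather keep the time.sleep() call, another option is to throw away whatever is already buffered just before recording. The sketch below relies on the stream's get_read_available() method, which reports how many frames can be read without blocking. drain() and FakeStream are hypothetical names, and FakeStream exists only so the snippet runs without audio hardware:

```python
def drain(stream, chunk):
    """Discard frames that are already buffered; return how many were dropped."""
    dropped = 0
    available = stream.get_read_available()
    while available > 0:
        n = min(available, chunk)
        stream.read(n)  # should not block: these frames are already buffered
        dropped += n
        available = stream.get_read_available()
    return dropped


class FakeStream(object):
    """Hypothetical stand-in for a PyAudio input stream (illustration only)."""

    def __init__(self, buffered_frames, bytes_per_frame=2):
        self.buffered = buffered_frames
        self.bytes_per_frame = bytes_per_frame

    def get_read_available(self):
        return self.buffered

    def read(self, num_frames):
        # Hand back silence and shrink the pretend backlog.
        self.buffered -= min(num_frames, self.buffered)
        return b"\x00" * (num_frames * self.bytes_per_frame)


fake = FakeStream(buffered_frames=4096)
dropped = drain(fake, 1024)
print(dropped)
```

With a real stream you would call drain(stream, chunk) right after the sleep and before record(.2). If the buffer overflowed during the sleep, newer PyAudio versions may raise an error on read() unless you pass exception_on_overflow=False.
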
Licensed under: CC-BY-SA with attribution