Synchronizing audio and video with OpenCV and PyAudio

Question 1

I think you'd be better off using either GStreamer or ffmpeg or, if you're on Windows, DirectShow. These libraries handle both audio and video, and should provide some kind of multiplexer that lets you combine video and audio properly.

But if you really want to do this with OpenCV, you should be able to get the frame rate from the VideoCapture object. Have you tried this?

fps = vc.get(cv2.CAP_PROP_FPS)

Another way would be to estimate fps as number of frames divided by duration:

n_frames = vc.get(cv2.CAP_PROP_FRAME_COUNT)
vc.set(cv2.CAP_PROP_POS_AVI_RATIO, 1)     # seek to the end of the file
duration = vc.get(cv2.CAP_PROP_POS_MSEC)  # position at the end, in milliseconds
fps = 1000 * n_frames / duration
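
The two approaches above can be combined: use the reported frame rate when it looks sane, otherwise fall back to the frame-count estimate. This is only a sketch, assuming the three values have already been read from the capture. `resolve_fps` and its `default` parameter are names made up for illustration, not part of OpenCV.

```python
import math


def resolve_fps(reported_fps, n_frames, duration_ms, default=30.0):
    """Pick a usable frame rate from values read off a capture.

    Hypothetical helper: a capture may report 0 (or a non-finite value)
    when the FPS property is unavailable, so fall back to
    frames / duration, and finally to an assumed default.
    """
    if reported_fps and math.isfinite(reported_fps) and reported_fps > 0:
        return float(reported_fps)
    if n_frames > 0 and duration_ms > 0:
        # duration is in milliseconds, hence the factor of 1000
        return 1000.0 * n_frames / duration_ms
    return default
```

The default of 30 is an arbitrary assumption; pick whatever suits your source material.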

I'm not sure I understand what you were trying to do here:

before_read = time.time()
rval, frame = vc.read()
after_read  = time.time()

It seems to me that after_read - before_read only measures how long OpenCV took to load the next frame; it doesn't measure the fps. OpenCV is not trying to do playback. It only loads frames, as fast as it can, and I don't think there's a way to configure that. I think calling waitKey(int(1000 / fps)) after displaying each frame will achieve what you're looking for, since waitKey takes its delay as an integer number of milliseconds.
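
If reading and drawing a frame takes a noticeable amount of time, that time can be subtracted from the wait so playback doesn't run slow. Here is a rough sketch of that idea. `frame_delay_ms` is a hypothetical helper, and the loop in the comment assumes `vc` is an already opened cv2.VideoCapture. The result is clamped to at least 1 because waitKey(0) waits for a key press indefinitely.

```python
def frame_delay_ms(fps, elapsed_s=0.0):
    """Milliseconds to pass to waitKey before showing the next frame.

    fps       -- target frame rate of the video
    elapsed_s -- seconds already spent reading/displaying this frame
    """
    frame_period_ms = 1000.0 / fps
    remaining_ms = frame_period_ms - elapsed_s * 1000.0
    # Never return 0: cv2.waitKey(0) blocks until a key is pressed.
    return max(1, int(round(remaining_ms)))


# Intended use inside a playback loop (not run here):
#   start = time.time()
#   rval, frame = vc.read()
#   cv2.imshow("video", frame)
#   cv2.waitKey(frame_delay_ms(fps, time.time() - start))
```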

Question 2

You could keep two counters, one for audio and one for video. The video counter increases by 1/fps each time a frame is shown, and the audio counter increases by sec, where sec is the number of seconds of audio you write to the stream each time. Then, in the audio part of the code, you can do something like:

while audiosec - videosec >= 0.05:  # audio is ahead
    time.sleep(0.05)

And in the video part:

while videosec - audiosec >= 0.2:  # video is ahead
    time.sleep(0.2)

You can play with the numbers.

This is how I achieved some degree of synchronization in my own video player project, which uses PyAudio and, more recently, ffmpeg instead of cv2.
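
To make the bookkeeping concrete, here is a small sketch of the two counters wrapped in a class. The class name `AVClock` and its method names are invented for this example; the thresholds default to the 0.05 and 0.2 seconds mentioned above. The lock is an assumption on my part, for the case where the audio and video parts run in separate threads.

```python
import threading


class AVClock:
    """Tracks how many seconds of audio and video have been played so far."""

    def __init__(self, fps, audio_threshold=0.05, video_threshold=0.2):
        self.fps = fps
        self.audio_threshold = audio_threshold
        self.video_threshold = video_threshold
        self.audio_sec = 0.0
        self.video_sec = 0.0
        self._lock = threading.Lock()

    def frame_shown(self):
        """Call after displaying one video frame."""
        with self._lock:
            self.video_sec += 1.0 / self.fps

    def audio_written(self, n_samples, rate):
        """Call after writing n_samples sample frames at the given rate."""
        with self._lock:
            self.audio_sec += n_samples / rate

    def audio_should_wait(self):
        """True while audio is ahead of video by at least the threshold."""
        with self._lock:
            return self.audio_sec - self.video_sec >= self.audio_threshold

    def video_should_wait(self):
        """True while video is ahead of audio by at least the threshold."""
        with self._lock:
            return self.video_sec - self.audio_sec >= self.video_threshold
```

The audio loop would then call `time.sleep(0.05)` while `audio_should_wait()` is true, and the video loop would call `time.sleep(0.2)` while `video_should_wait()` is true.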

Question 3

Personally, I used threading for this.

import concurrent.futures
import pyaudio
import cv2


class Aud_Vid():

    def __init__(self, arg):
        self.video = cv2.VideoCapture(0)  # default camera
        self.CHUNK = 1470                 # audio frames per read
        self.FORMAT = pyaudio.paInt16
        self.CHANNELS = 2
        self.RATE = 44100
        self.audio = pyaudio.PyAudio()
        self.instream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS, rate=self.RATE, input=True, frames_per_buffer=self.CHUNK)
        self.outstream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS, rate=self.RATE, output=True, frames_per_buffer=self.CHUNK)

    def sync(self):
        # Read one video frame and one audio chunk in parallel threads.
        with concurrent.futures.ThreadPoolExecutor() as executor:
            tv = executor.submit(self.video.read)
            ta = executor.submit(self.instream.read, self.CHUNK)
            vid = tv.result()  # (success flag, frame)
            aud = ta.result()  # raw audio bytes
            return (vid[1].tobytes(), aud)
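
The chunk size of 1470 happens to equal 44100 / 30, that is, one video frame's worth of audio at 30 fps, although the code does not say so and I'm assuming that is the intent. If your sample rate or frame rate differs, a chunk size can be derived the same way. `samples_per_video_frame` is a hypothetical helper for illustration.

```python
def samples_per_video_frame(sample_rate, fps):
    """Number of audio sample frames spanning one video frame.

    Rounded to the nearest integer; when the ratio is not whole
    (e.g. 29.97 fps) the rounding error accumulates over time, so
    this alone is not enough to keep long recordings in sync.
    """
    return int(round(sample_rate / fps))


chunk = samples_per_video_frame(44100, 30)  # same value as CHUNK above
```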