Question

I've been struggling for a few weeks with a phase vocoder. The ultimate goal is to time-stretch a signal. I've made a lot of progress, but I still have two issues to solve.

Issue 1: Do I need a synthesis window?
I take overlapping frames from the input signal (a sine wave) with some hop size (e.g. N/2, where N = samples per frame). I apply a Hann (Hanning) window to each frame and feed the result to an FFT. To achieve time stretching, I perform an inverse FFT and overlap-add the output frames using a different hop size from the one used during analysis.
The problem is that with an output hop factor of 0.5 (hop size = N/2) the output is smooth, but for larger hop sizes I can hear 'vibrations', i.e. an amplitude modulation at the frame rate. The image shows the output of 8 frames with a hop factor of 1 (zero overlap), and it is evident why the sound is vibrating: each frame fades to zero at its edges and nothing overlaps to fill the gap. For small hop sizes the frames overlap much more and the sound is smoother. I've read a lot about phase vocoding, but I don't see how to get a smooth output for large hop sizes. What am I missing?
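
The effect can be reproduced without any audio by overlap-adding the windows alone. The following is a minimal sketch, assuming a periodic Hann window; `periodic_hann` and `window_sum_ripple` are hypothetical helper names, not part of the vocoder described above. It measures how much the summed window envelope fluctuates for a given output hop size.

```python
import numpy as np

def periodic_hann(N):
    # Periodic Hann window: w[n] = 0.5 - 0.5*cos(2*pi*n/N), n = 0..N-1
    n = np.arange(N)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * n / N)

def window_sum_ripple(N, hop, n_frames=16):
    # Overlap-add n_frames copies of the window, spaced `hop` samples apart.
    w = periodic_hann(N)
    total = np.zeros(hop * (n_frames - 1) + N)
    for k in range(n_frames):
        total[k * hop : k * hop + N] += w
    # Ignore the fade-in and fade-out at the two ends.
    steady = total[N:-N]
    # 0.0 means a flat envelope, 1.0 means the envelope drops to zero.
    return (steady.max() - steady.min()) / steady.max()

if __name__ == "__main__":
    N = 1024
    for hop in (N // 4, N // 2, 3 * N // 4, N):
        print(f"hop = {hop:5d}  ripple = {window_sum_ripple(N, hop):.3f}")
```

With a hop of N/2 the summed envelope is essentially flat, while with a hop of N it falls to zero at every frame boundary, which matches the modulation visible in the image.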

[Image: output of 8 overlap-added frames with hop factor = 1 (zero overlap)]

Issue 2: Phase correction.
Currently the output sounds worse with phase correction, but I'll leave that for another post.

Thanks in advance for taking the time.

Solution

I'm an amateur at this, but wouldn't you get a better result if you started with a much bigger overlap, e.g. a "hop size" of N/10 or something like that? Then you'd have more freedom to adjust it on output while still keeping a substantial overlap.
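
To make that concrete, here is a rough sketch rather than a full phase vocoder: it does no phase correction at all. It uses an analysis hop of N/8 (standing in for "N/10 or something like that") so that even a 2x stretch still leaves a 75% overlap on output. The function name `ola_stretch` and the division by the summed window are my own assumptions, not something taken from the question.

```python
import numpy as np

def ola_stretch(x, N, analysis_hop, synthesis_hop):
    # Periodic Hann analysis window; no separate synthesis window is applied.
    n = np.arange(N)
    w = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / N)

    n_frames = 1 + (len(x) - N) // analysis_hop
    out = np.zeros(synthesis_hop * (n_frames - 1) + N)
    norm = np.zeros_like(out)

    for k in range(n_frames):
        # Read frames at the analysis hop ...
        frame = x[k * analysis_hop : k * analysis_hop + N] * w
        spectrum = np.fft.rfft(frame)
        # (a real phase vocoder would adjust the phases of `spectrum` here)
        y = np.fft.irfft(spectrum, n=N)
        # ... and write them back at the synthesis hop.
        start = k * synthesis_hop
        out[start:start + N] += y
        norm[start:start + N] += w

    # Assumption: divide by the accumulated window so the envelope stays flat.
    norm[norm < 1e-8] = 1.0
    return out / norm

if __name__ == "__main__":
    N = 512
    x = np.ones(4096)  # constant test signal: any envelope ripple would show up directly
    y = ola_stretch(x, N, analysis_hop=N // 8, synthesis_hop=N // 4)
    print(len(x), "->", len(y))
```

Because the analysis hop is small, the output hop can be doubled or tripled and the frames still overlap heavily, so the envelope stays smooth. A sine input will still sound wrong with this sketch unless the phases are corrected, which is the separate issue mentioned in the question.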

Also, it might pay to adjust the steepness of the window depending on how much you're expanding/compressing time.

Licensed under: CC-BY-SA with attribution