Question

This is my first post on Stack Overflow. So far this site has been very helpful, but I am a novice and need a clear explanation of my problem, which is related to pitch-shifting audio in Python. I currently have these modules installed: numpy, scipy, pygame, and the scikits "samplerate" API.

My goal is to take a stereo file and play it back at a different pitch in as few steps as possible. Currently, I load the file into an array using pygame.sndarray, then apply a sample rate conversion using scikits.samplerate.resample, then convert the output back to a sound object for playback using pygame. The problem is that garbage audio comes out of my speakers. Surely I'm missing a few steps (in addition to not knowing anything about math and audio).

Thanks.

import time, numpy, pygame.mixer, pygame.sndarray
from scikits.samplerate import resample

pygame.mixer.init(44100,-16,2,4096)

# choose a file and make a sound object
sound_file = "tone.wav"
sound = pygame.mixer.Sound(sound_file)

# load the sound into an array
snd_array = pygame.sndarray.array(sound)

# resample. args: (target array, ratio, mode), outputs ratio * target array.
# this outputs a bunch of garbage and I don't know why.
snd_resample = resample(snd_array, 1.5, "sinc_fastest")

# take the resampled array, make it an object and stop playing after 2 seconds.
snd_out = pygame.sndarray.make_sound(snd_resample)
snd_out.play()
time.sleep(2)

Solution

Your problem is that pygame works with numpy.int16 arrays, but the call to resample returns a numpy.float32 array:

>>> snd_array.dtype
dtype('int16')
>>> snd_resample.dtype
dtype('float32')

You can convert the result of resample to numpy.int16 using astype:

>>> snd_resample = resample(snd_array, 1.5, "sinc_fastest").astype(snd_array.dtype)
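
A plain astype truncates fractional values and does not guard against samples that fall outside the int16 range, which an interpolating resampler may produce on loud material. Here is a minimal sketch of a more defensive conversion, assuming the resampled values are on the same scale as the int16 input. The helper name to_int16 is hypothetical and not part of numpy or scikits.samplerate, and the array below is only a stand-in for real resample output:

```python
import numpy as np

def to_int16(samples):
    """Round float samples and clip them to the int16 range before casting."""
    info = np.iinfo(np.int16)
    rounded = np.rint(np.asarray(samples, dtype=np.float64))
    return np.clip(rounded, info.min, info.max).astype(np.int16)

# Stand-in for resample() output: float32 values, two of them out of range.
fake_resampled = np.array([40000.0, -40000.0, 1.4, -2.6], dtype=np.float32)
converted = to_int16(fake_resampled)
print(converted.dtype, converted.tolist())
```

The out-of-range values are pinned to the int16 limits instead of being cast unchecked.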

With this modification, your Python script plays the tone.wav file nicely, at a lower pitch and a lower speed.
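
The pitch drops because resampling by a ratio of 1.5 produces 1.5 times as many samples, while the mixer still plays them at 44100 Hz. The clip therefore lasts 1.5 times longer and every frequency is divided by 1.5. Here is a small sketch of that arithmetic; the 440 Hz tone, the 2 second duration, and the function name are illustrative assumptions, not properties of tone.wav:

```python
def playback_after_resample(freq_hz, duration_s, ratio):
    """Perceived frequency and duration when the resampled audio is
    played back at the original sample rate."""
    return freq_hz / ratio, duration_s * ratio

freq, dur = playback_after_resample(440.0, 2.0, 1.5)
print(round(freq, 2), dur)
```

By the same reasoning, a ratio below 1 (for example 0.5) raises the pitch and shortens the clip.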

Other tips

Your best bet is probably using Audiere from Python.

Here is a link. I used it to do the same sort of thing, and it's very easy once you have read the documentation.

http://audiere.sourceforge.net/home.php

Most likely scikits.samplerate.resample is "thinking" your audio is in a format other than 16-bit stereo. Check the scikits.samplerate documentation for how to select the proper audio format for your array. If it resampled 16-bit audio while treating it as 8-bit, garbage is what would come out.

From the scikits.samplerate.resample documentation:

If input has rank 1, than all data are used, and are assumed to be from a mono signal. If rank is 2, the number columns will be assumed to be the number of channels.

So I think what you need to do is something like this to pass the stereo data to resample in the format it expects:

snd_array = snd_array.reshape((-1,2))

snd_resample = resample(snd_array, 1.5, "sinc_fastest")

snd_resample = snd_resample.reshape(-1) # Flatten it out again
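
The layout the documentation describes (one row per frame, one column per channel) can be illustrated without scikits.samplerate. The sketch below uses plain linear interpolation via numpy.interp as a stand-in for resample, so its audio quality is not comparable to the sinc converters. The name resample_linear is hypothetical:

```python
import numpy as np

def resample_linear(frames, ratio):
    """Resample a (n_frames, n_channels) array channel by channel
    using linear interpolation; output has about ratio * n_frames rows."""
    frames = np.asarray(frames, dtype=np.float64)
    n_in = frames.shape[0]
    n_out = int(round(n_in * ratio))
    old_pos = np.arange(n_in)
    new_pos = np.linspace(0, n_in - 1, n_out)
    # Interpolate each column (channel) independently.
    columns = [np.interp(new_pos, old_pos, frames[:, ch])
               for ch in range(frames.shape[1])]
    return np.stack(columns, axis=1)

# Four stereo frames: left channel rises, right channel is constant.
stereo = np.array([[0, 100], [10, 100], [20, 100], [30, 100]], dtype=np.int16)
out = resample_linear(stereo, 1.5)
print(out.shape)
```

The result keeps the (frames, channels) shape, with each channel stretched independently, and would still need converting back to int16 before playback.
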
Licensed under: CC-BY-SA with attribution