Question

I am trying to write an audio fingerprinting library for educational purposes. It's based on the paper Computer Vision for Music Identification. I have a couple of questions about the contents of the paper.

  1. I know that two bytes represent one sample, so I wrote this class to extract the samples from a PCM file. I'd like to know if this is right (sorry if it's too obvious :) ).

    class FingerPrint:
    
       def __init__(self, pcmFile):
          self.pcmFile = pcmFile
          self.samples = []
          self.init()
    
    
       def init(self):
          # Current samples
          currentSamples = []
    
          # Read pcm file
          with open(self.pcmFile, 'rb') as f:
             byte = f.read(2)
             while byte != '':
               self.samples.append(byte)
               byte = f.read(2)
    
    fp = FingerPrint('output.pcm')
    
  2. If the above code is OK, then according to the book the next step is to convolve the signal with a low-pass filter and take every 8th sample. I don't understand these steps or why they have to be done; it would be awesome if someone could help me understand (with code if possible).


Solution

After reading each pair of bytes, you need to convert it into an int. You can use the struct module for that.
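As a minimal sketch of that conversion, assuming the file holds signed 16-bit little-endian mono samples (the question does not state the byte order or sign), the helpers below decode the raw bytes with struct. The names decode_pcm16 and read_pcm16 are made up for this illustration.

```python
import struct

def decode_pcm16(raw):
    """Decode raw bytes as signed 16-bit little-endian samples (assumed format)."""
    # Ignore a trailing odd byte so the data is a whole number of samples.
    count = len(raw) // 2
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return list(struct.unpack("<%dh" % count, raw[:count * 2]))

def read_pcm16(path):
    """Read a whole raw PCM file and return its samples as Python ints."""
    with open(path, "rb") as f:
        return decode_pcm16(f.read())

# Three samples written out by hand: 1, -1 and the most negative value.
print(decode_pcm16(b"\x01\x00\xff\xff\x00\x80"))
```

Note that in Python 3 a file opened in 'rb' mode returns bytes, so an end-of-file check has to compare against b'' rather than ''.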

But I think you should use NumPy and SciPy instead:

To read a WAV file, you can call scipy.io.wavfile.read():

http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/io.html#module-scipy.io.wavfile

If your file is raw PCM data, you can call numpy.fromfile()

http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html

For example:

    import numpy
    data = numpy.fromfile("test.pcm", dtype=numpy.int16)

To design a low-pass filter, you can use the filter design functions in scipy.signal:

http://docs.scipy.org/doc/scipy-0.10.1/reference/signal.html#filter-design

To do the convolution, you can use the convolution functions in scipy.signal:

http://docs.scipy.org/doc/scipy-0.10.1/reference/signal.html#convolution

There is also a convolve function in numpy:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html
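Putting those pieces together, one possible sketch of the "low-pass filter, then take every 8th sample" step follows. The filter length (101 taps), the 44100 Hz sample rate and the synthetic 440 Hz test tone are assumptions for illustration only; the paper may specify a different filter. The function name lowpass_and_decimate is hypothetical.

```python
import numpy as np
from scipy import signal

def lowpass_and_decimate(samples, factor=8, numtaps=101):
    """Low-pass filter `samples`, then keep every `factor`-th sample."""
    # firwin takes the cutoff as a fraction of the Nyquist frequency;
    # 1/factor puts it at the Nyquist frequency of the reduced rate.
    taps = signal.firwin(numtaps, 1.0 / factor)
    # mode="same" keeps the output aligned with, and as long as, the input.
    filtered = np.convolve(samples, taps, mode="same")
    return filtered[::factor]

# Synthetic stand-in for decoded PCM: one second at an assumed 44100 Hz.
rate = 44100
t = np.arange(rate) / rate
samples = np.sin(2 * np.pi * 440 * t)

reduced = lowpass_and_decimate(samples)
print(len(samples), len(reduced))
```

The 440 Hz tone lies well below the cutoff, so it passes through almost unchanged while the number of samples drops by a factor of 8.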

Other tips

It sounds like the algorithm you're using performs a filter-and-decimate operation to reduce the sample rate of the data by a factor of 8. Fewer samples are then fed to downstream functions that may be computationally expensive but do not need the full bandwidth of the input. The convolution you mention performs the low-pass filtering: the input is convolved with the impulse response of the desired filter. The filter has to come first because keeping every 8th sample lowers the highest representable frequency by a factor of 8, and anything above that limit would otherwise alias into the retained band. These are standard signal processing operations that you can read up on in any text on digital signal processing.
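To see why the filter matters, here is a small NumPy-only experiment. It is a sketch under assumed numbers: an 8000 Hz input rate, a 3100 Hz test tone, and a 101-tap Hamming-windowed sinc filter, none of which come from the paper. It compares plain "every 8th sample" decimation with filtering first.

```python
import numpy as np

rate = 8000      # assumed input sample rate in Hz
factor = 8       # keep every 8th sample, giving a 1000 Hz output rate
t = np.arange(rate) / rate
# A 3100 Hz tone, far above the 500 Hz Nyquist limit of the output rate.
tone = np.sin(2 * np.pi * 3100 * t)

# Hamming-windowed sinc low-pass with its cutoff at the new Nyquist frequency.
numtaps = 101
n = np.arange(numtaps) - (numtaps - 1) / 2
cutoff = 0.5 / factor                     # in cycles per input sample
taps = np.sinc(2 * cutoff * n) * np.hamming(numtaps)
taps /= taps.sum()                        # unity gain at 0 Hz

# Decimate without filtering: the tone survives as a false 100 Hz tone.
naive = tone[::factor]
# Filter first; mode="valid" drops the edges where the filter overlaps padding.
safe = np.convolve(tone, taps, mode="valid")[::factor]

def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(x ** 2)))

print("unfiltered:", rms(naive), "filtered:", rms(safe))
```

Without the filter, the out-of-band tone shows up in the decimated signal at nearly full strength as an alias; with the filter it is almost entirely removed before the samples are dropped.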

Licensed under: CC-BY-SA with attribution