Invertierbar STFT und ISTFT in Python

https://stackoverflow.com/questions/2459295

20-09-2019
|

Frage

Gibt es eine Allgemeine-Zweck form short-time-Fourier-Transformation mit entsprechende inverse Transformation gebaut in SciPy oder NumPy oder was auch immer?

Es gibt die pyplot specgram Funktion in matplotlib, die fordert, ax.specgram(), die aufruft mlab.specgram(), die aufruft _spectral_helper():

#The checks for if y is x are so that we can use the same function to
#implement the core of psd(), csd(), and spectrogram() without doing
#extra calculations.  We return the unaveraged Pxy, freqs, and t.

aber

Dies ist eine Hilfsfunktion, die die Gemeinsamkeiten zwischen den 204 #psd -, csd-und Spektrogramm.Es ist NICHT dazu genutzt werden, um außerhalb von mlab

Ich bin mir nicht sicher, ob dies verwendet werden kann, um eine STFT und ISTFT, obwohl.Gibt es sonst noch etwas, oder sollte ich etwas übersetzen, wie diese MATLAB-Funktionen?

Ich weiß, wie ich Schreibe meine eigene ad-hoc-Umsetzung;Ich bin gerade auf der Suche nach etwas, full-featured, die mit verschiedenen windowing-Funktionen (aber hat einen vernünftigen Standard), ist vollständig invertierbar mit COLA windows - (istft(stft(x))==x), getestet, indem mehrere Leute, wird keine " off-by-one-Fehler, Griffe enden und keine Polsterung gut, schnell RFFT Umsetzung für die Echtzeit-input, etc.

Lösung

Ich bin ein wenig zu spät kommt, aber realisiert scipy hat eingebaute ISTFT Funktion ab 0.19.0

Andere Tipps

Hier ist mein Python-code, vereinfachte für diese Antwort:

import scipy, pylab

def stft(x, fs, framesz, hop):
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = scipy.hanning(framesamp)
    X = scipy.array([scipy.fft(w*x[i:i+framesamp]) 
                     for i in range(0, len(x)-framesamp, hopsamp)])
    return X

def istft(X, fs, T, hop):
    x = scipy.zeros(T*fs)
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += scipy.real(scipy.ifft(X[n]))
    return x

Hinweise:

Die list comprehension ist ein kleiner trick, den ich verwenden möchte, um zu simulieren-block-Verarbeitung von Signalen in numpy/scipy.Es ist wie blkproc in Matlab.Statt einer for loop habe ich mit einem Befehl (z.B., fft), um jeden frame des Signals in einer Liste Verständnis, und dann scipy.array wirft es in ein 2D-array.Ich benutze dies, um Spektrogramme, chromagrams, MFCC-Gramm, und vieles mehr.
Für dieses Beispiel verwende ich eine naive overlap-and-add-Methode in istft.Um zu rekonstruieren, wie die ursprüngliche signal, die Summe der sequentiellen Fenster-Funktionen müssen konstant sein, vorzugsweise gleich der Einheit (1.0).In diesem Fall habe ich entschieden Hann (oder hanning) Fenster und ein 50% überlappung, die einwandfrei funktioniert.Finden diese Diskussion weitere Informationen.
Es gibt wahrscheinlich mehr prinzipielle Möglichkeiten zur Berechnung der ISTFT.Dieses Beispiel ist vor allem dazu gedacht, um mit Bildung.

Ein test:

if __name__ == '__main__':
    f0 = 440         # Compute the STFT of a 440 Hz sinusoid
    fs = 8000        # sampled at 8 kHz
    T = 5            # lasting 5 seconds
    framesz = 0.050  # with a frame size of 50 milliseconds
    hop = 0.025      # and hop size of 25 milliseconds.

    # Create test signal and STFT.
    t = scipy.linspace(0, T, T*fs, endpoint=False)
    x = scipy.sin(2*scipy.pi*f0*t)
    X = stft(x, fs, framesz, hop)

    # Plot the magnitude spectrogram.
    pylab.figure()
    pylab.imshow(scipy.absolute(X.T), origin='lower', aspect='auto',
                 interpolation='nearest')
    pylab.xlabel('Time')
    pylab.ylabel('Frequency')
    pylab.show()

    # Compute the ISTFT.
    xhat = istft(X, fs, T, hop)

    # Plot the input and output signals over 0.1 seconds.
    T1 = int(0.1*fs)

    pylab.figure()
    pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1])
    pylab.xlabel('Time (seconds)')

    pylab.figure()
    pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:])
    pylab.xlabel('Time (seconds)')

STFT of 440 Hz sinusoid ISTFT of beginning of 440 Hz sinusoid ISTFT of end of 440 Hz sinusoid

Hier ist der STFT-Code, die ich benutze. STFT + ISTFT hier gibt perfekte Rekonstruktion (auch für die erste Frames). Ich etwas hier den Code von Steve Tjoa gegeben geändert:. Hier die Größe des rekonstruierten Signals ist das gleiche wie die des Eingangssignals

import scipy, numpy as np

def stft(x, fftsize=1024, overlap=4):   
    hop = fftsize / overlap
    w = scipy.hanning(fftsize+1)[:-1]      # better reconstruction with this trick +1)[:-1]  
    return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])

def istft(X, overlap=4):   
    fftsize=(X.shape[1]-1)*2
    hop = fftsize / overlap
    w = scipy.hanning(fftsize+1)[:-1]
    x = scipy.zeros(X.shape[0]*hop)
    wsum = scipy.zeros(X.shape[0]*hop) 
    for n,i in enumerate(range(0, len(x)-fftsize, hop)): 
        x[i:i+fftsize] += scipy.real(np.fft.irfft(X[n])) * w   # overlap-add
        wsum[i:i+fftsize] += w ** 2.
    pos = wsum != 0
    x[pos] /= wsum[pos]
    return x

librosa.core.stft und

Gefunden eine andere STFT, aber keine entsprechende inverse Funktion:

http://code.google.com /p/pytfd/source/browse/trunk/pytfd/stft.py

def stft(x, w, L=None):
    ...
    return X_stft

w ist eine Fensterfunktion als Array
L ist die Überlappung, in Proben

Keiner der oben genannten Antworten hat gut funktioniert OOTB für mich. So modifizierte ich Steve Tjoa ist.

import scipy, pylab
import numpy as np

def stft(x, fs, framesz, hop):
    """
     x - signal
     fs - sample rate
     framesz - frame size
     hop - hop size (frame size = overlap + hop size)
    """
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = scipy.hamming(framesamp)
    X = scipy.array([scipy.fft(w*x[i:i+framesamp]) 
                     for i in range(0, len(x)-framesamp, hopsamp)])
    return X

def istft(X, fs, T, hop):
    """ T - signal length """
    length = T*fs
    x = scipy.zeros(T*fs)
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += scipy.real(scipy.ifft(X[n]))
    # calculate the inverse envelope to scale results at the ends.
    env = scipy.zeros(T*fs)
    w = scipy.hamming(framesamp)
    for i in range(0, len(x)-framesamp, hopsamp):
        env[i:i+framesamp] += w
    env[-(length%hopsamp):] += w[-(length%hopsamp):]
    env = np.maximum(env, .01)
    return x/env # right side is still a little messed up...

Ich fand auch dieses auf GitHub, aber es scheint auf Pipelines statt normaler Arrays zu arbeiten:

http://github.com/ronw/frontend/blob /master/basic.py#LID281

def STFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(Framer(nwin, nhop), Window(winfun),
                                  RFFT(nfft))


def ISTFT(nfft, nwin=None, nhop=None, winfun=np.hanning):
    ...
    return dataprocessor.Pipeline(IRFFT(nfft), Window(winfun),
                                  OverlapAdd(nwin, nhop))

Ich denke, scipy.signal hat, was Sie suchen. Es hat vernünftige Standardwerte, unterstützt mehrere Fenstertypen, etc ...

http: // docs. scipy.org/doc/scipy-0.17.0/reference/generated/scipy.signal.spectrogram.html

from scipy.signal import spectrogram
freq, time, Spec = spectrogram(signal)

Eine feste Version von basj Antwort.

import scipy, numpy as np
import matplotlib.pyplot as plt

def stft(x, fftsize=1024, overlap=4):
    hop=fftsize//overlap
    w = scipy.hanning(fftsize+1)[:-1]      # better reconstruction with this trick +1)[:-1]  
    return np.vstack([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])

def istft(X, overlap=4):   
    fftsize=(X.shape[1]-1)*2
    hop=fftsize//overlap
    w=scipy.hanning(fftsize+1)[:-1]
    rcs=int(np.ceil(float(X.shape[0])/float(overlap)))*fftsize
    print(rcs)
    x=np.zeros(rcs)
    wsum=np.zeros(rcs)
    for n,i in zip(X,range(0,len(X)*hop,hop)): 
        l=len(x[i:i+fftsize])
        x[i:i+fftsize] += np.fft.irfft(n).real[:l]   # overlap-add
        wsum[i:i+fftsize] += w[:l]
    pos = wsum != 0
    x[pos] /= wsum[pos]
    return x

a=np.random.random((65536))
b=istft(stft(a))
plt.plot(range(len(a)),a,range(len(b)),b)
plt.show()

Wenn Sie Zugriff auf eine C binäre Bibliothek, das tut, was Sie wollen, dann verwenden Sie http: // code.google.com/p/ctypesgen/ eine Python-Schnittstelle zu dieser Bibliothek erzeugen.

Lizenziert unter: CC-BY-SA mit Zuschreibung

Nicht verbunden mit StackOverflow