Filter design and frequency extraction in Python

Question 1

Some sparse comments:

1) On the top picture: I can't say which of filt and filtfilt is better, though the shift in frequency with filtfilt is worrying. You can obtain a similar result by applying a low-pass filter to the filt signal.

2) There isn't a "true" instantaneous frequency, unless the signal was specifically generated with a certain tone. In my experience unwrapping the phase of the Hilbert transform does a good job in many cases, though. It becomes less and less reliable as the ratio of noise to signal intensity grows.

3) Regarding the bottom picture, you say that sometimes you need a wide bandpass filter. Is this because the signal is very long, and the instantaneous frequency moves around between 500 and 800 Hz? If so, you may want to window the signal down to a length at which the filtered signal has a distinct peak in the Fourier spectrum, then, for each window: extract that peak, tailor your bandpass filter to it, apply the Hilbert transform to the windowed signal, extract the phase, and filter the phase.

This is worth doing if you are sure the signal contains other harmonics besides noise and the one you are interested in, but it takes a while. Before doing so, I would want to be sure that the data I am getting now is actually wrong.
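The windowed procedure from point 3 could be sketched roughly as below. This is only an illustration under assumptions of mine: the function names, the Butterworth design, the 30 Hz half-width, and the one-second window are all made up for the example, and a median over the middle of each window stands in for "filter the phase". It uses zero-phase filtering (sosfiltfilt); swap in whatever filter you already use.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def bandpass(x, lo, hi, fs, order=4):
    """Zero-phase Butterworth bandpass (hypothetical helper)."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def windowed_inst_freq(x, fs, win_len, wide=(500.0, 800.0), half_width=30.0):
    """Per window: spectral peak (Hz) and median instantaneous frequency (Hz)."""
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    trim = max(1, win_len // 10)  # drop window edges, where filter/Hilbert transients live
    peaks, inst = [], []
    for start in range(0, len(x) - win_len + 1, win_len):
        # 1) wide bandpass over the range the frequency is expected to roam in
        seg = bandpass(x[start:start + win_len], wide[0], wide[1], fs)
        # 2) locate the dominant peak in the Fourier spectrum of this window
        spec = np.abs(np.fft.rfft(seg * np.hanning(win_len)))
        f_peak = freqs[np.argmax(spec)]
        # 3) narrow bandpass tailored to that peak
        narrow = bandpass(seg, f_peak - half_width, f_peak + half_width, fs)
        # 4) Hilbert transform, unwrapped phase, phase derivative in Hz
        phase = np.unwrap(np.angle(hilbert(narrow)))
        f_inst = np.diff(phase) * fs / (2.0 * np.pi)
        peaks.append(f_peak)
        inst.append(np.median(f_inst[trim:-trim]))
    return np.array(peaks), np.array(inst)


# Synthetic check: 600 Hz for the first second, 700 Hz for the second, plus noise.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(2 * fs) / fs
tone = np.where(t < 1.0, np.sin(2 * np.pi * 600 * t), np.sin(2 * np.pi * 700 * t))
x = tone + 0.2 * rng.standard_normal(t.size)
peaks, inst = windowed_inst_freq(x, fs, win_len=fs)
```

On real data the window length is the thing to tune: long enough for a distinct spectral peak, short enough that the frequency does not wander much inside one window.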

If it is simply one harmonic plus noise, I would: low-pass the signal, apply the Hilbert transform, extract the instantaneous phase, and low-pass again on the instantaneous phase.
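For the single-harmonic case, a minimal sketch of that chain might look like this. The cutoffs, the filter order, and the function names are assumptions chosen for the synthetic example, not values taken from your data.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def lowpass(x, cutoff, fs, order=4):
    """Zero-phase Butterworth lowpass (hypothetical helper)."""
    sos = butter(order, cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def inst_freq_single_tone(x, fs, signal_cutoff, smooth_cutoff):
    """Lowpass -> Hilbert -> unwrapped phase -> lowpass the phase -> frequency in Hz."""
    clean = lowpass(x, signal_cutoff, fs)
    phase = np.unwrap(np.angle(hilbert(clean)))
    # The zero-phase lowpass has unit DC gain, so the linear trend of the
    # unwrapped phase passes through while the fast jitter is removed.
    phase_smooth = lowpass(phase, smooth_cutoff, fs)
    return np.gradient(phase_smooth) * fs / (2.0 * np.pi)


# Synthetic check: a 50 Hz tone buried in white noise.
rng = np.random.default_rng(1)
fs = 2000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 50 * t) + 0.3 * rng.standard_normal(t.size)
f_inst = inst_freq_single_tone(x, fs, signal_cutoff=100.0, smooth_cutoff=5.0)
f_mid = np.median(f_inst[fs // 2 : -fs // 2])  # ignore the edges of the record
```

The first and last stretch of the estimate are affected by edge effects of the filters and of the Hilbert transform, so it is usually safer to discard them.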

Question 2

I can't speak intelligently to your first problem, but SciPy is generally well documented, so I'd start by reading through its documentation.

For your second problem, a better-designed filter would certainly help. You say the data is "non-stationary"; do you know where in frequency it will be, or what kind of frequencies it might occupy? For example, if the signal is centered around one of three frequencies that you know a priori, you could build three different filters and run the signal through all three (only one of them giving you the output you want, of course).
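A rough sketch of that three-filter idea, assuming SciPy Butterworth bandpasses and picking the branch with the most output power. The candidate frequencies, the 25 Hz half-width, and the "largest mean power wins" rule are my own placeholders.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def filter_bank(x, fs, centers, half_width=25.0, order=4):
    """Run x through one bandpass per candidate center frequency.

    Returns the center whose output has the largest mean power, plus all outputs.
    """
    outputs = {}
    for fc in centers:
        sos = butter(order, [fc - half_width, fc + half_width],
                     btype="bandpass", fs=fs, output="sos")
        outputs[fc] = sosfiltfilt(sos, x)
    best = max(outputs, key=lambda fc: np.mean(outputs[fc] ** 2))
    return best, outputs


# Synthetic check: the tone sits at the middle candidate.
rng = np.random.default_rng(2)
fs = 4000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 650 * t) + 0.5 * rng.standard_normal(t.size)
best, outputs = filter_bank(x, fs, centers=(500.0, 650.0, 800.0))
```

Here `outputs[best]` is the branch you would keep; the other two contain mostly band-limited noise.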

If you don't have that kind of knowledge about the signal, I would first apply a wider BPF, then do some peak detection, and then apply a more stringent BPF once you know where the data you want is located.
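One way this wide-then-narrow approach could look, assuming a Welch spectrum for the peak detection step. The band edges, nperseg, half-width, and the function name are illustrative guesses; a real signal may need a prominence threshold in find_peaks rather than simply taking the strongest local maximum.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch, find_peaks


def two_stage_bandpass(x, fs, wide_band, half_width=20.0, order=4):
    """Wide BPF -> strongest spectral peak -> narrow BPF centered on that peak."""
    sos_wide = butter(order, wide_band, btype="bandpass", fs=fs, output="sos")
    coarse = sosfiltfilt(sos_wide, x)

    # Peak detection on the averaged power spectrum of the coarse output.
    freqs, psd = welch(coarse, fs=fs, nperseg=2048)
    idx, _ = find_peaks(psd)
    if idx.size == 0:
        raise ValueError("no spectral peak found inside the wide band")
    f_peak = freqs[idx[np.argmax(psd[idx])]]

    # Tighter filter now that the location of the signal is known.
    sos_narrow = butter(order, [f_peak - half_width, f_peak + half_width],
                        btype="bandpass", fs=fs, output="sos")
    return f_peak, sosfiltfilt(sos_narrow, coarse)


# Synthetic check: a 620 Hz tone somewhere inside a 400-900 Hz search band.
rng = np.random.default_rng(3)
fs = 4000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 620 * t) + 0.5 * rng.standard_normal(t.size)
f_peak, y = two_stage_bandpass(x, fs, wide_band=(400.0, 900.0))
```

If the signal really is non-stationary, the same function can be applied per time window so that the narrow filter follows the peak.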