Question

I am trying to do histogram matching of simulated data to observed precipitation data. The below shows a simple simulated case. I got the CDF of both the simulated and observed data and got stuck theree. I hope a clue would help me to get across..Thanks you in advance

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import scipy.stats as st


sim = st.gamma(1,loc=0,scale=0.8) # Simulated
obs = st.gamma(2,loc=0,scale=0.7) # Observed
x = np.linspace(0,4,1000)
simpdf = sim.pdf(x)
obspdf = obs.pdf(x)
plt.plot(x,simpdf,label='Simulated')
plt.plot(x,obspdf,'r--',label='Observed')
plt.title('PDF of Observed and Simulated Precipitation')
plt.legend(loc='best')
plt.show()

plt.figure(1)
simcdf = sim.cdf(x)
obscdf = obs.cdf(x)
plt.plot(x,simcdf,label='Simulated')
plt.plot(x,obscdf,'r--',label='Observed')
plt.title('CDF of Observed and Simulated Precipitation')
plt.legend(loc='best')
plt.show()

# Inverse CDF
invcdf = interp1d(obscdf,x)
transfer_func = invcdf(simcdf)

plt.figure(2)
plt.plot(transfer_func,x,'g-')
plt.show()
Was it helpful?

Solution

I tried to reproduce your code, and got the following error:

ValueError: A value in x_new is above the interpolation range.

If you look at the plot of your two CDFs it is pretty straight forward to figure out what is going on:

enter image description here

When you now define invcdf = interp1d(obscdf, x), notice that obscdf ranges from

>>> obscdf[0]
0.0
>>> obscdf[-1]
0.977852889924409

and so invcdf can only interpolate values between those limits: beyond them we would have to do extrapolation, which is not all that well defined. SciPy's default behavior is to raise an error when asked to extrapolate. Which is exactly what happens when you ask for invcdf(simcdf), because

>>> simcdf[-1]
0.99326205300091452

is beyond the interpolation range.

If you read the interp1d docs you will see that this behavior can be modified doing

invcdf = interp1d(obscdf, x, bounds_error=False)

and now everything works out fine, although you need to reverse the order of your plotting arguments to plt.plot(x, transfer_func,'g-') to get the same as in the figure you posted:

enter image description here

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top