Why does scipy.optimize.curve_fit produce parameters which are barely different from the guess?

Question 1

The problem you're facing is actually not due to curve_fit (or leastsq). It is due to the landscape of the objective of your optimisation problem. In your case the objective is the sum of residuals' squares, which you are trying to minimise. Now, if you look closely at your objective in a close surrounding of your initial conditions, for example using the code below, which only focuses on the first parameter:

p_ind = 0
eps = 1e-6
n_points = 100

frac_surroundings = np.linspace(p_guess[p_ind] - eps, p_guess[p_ind] + eps, n_points)

obj = []
temp_guess = p_guess.copy()
for p in frac_surroundings:
    temp_guess[0] = p
    obj.append(((grating_hist(x, *p_data) - grating_hist(x, *temp_guess))**2.0).sum())

py.plot(frac_surroundings, obj)
py.show()

you will notice that the landscape is piecewise constant (you can easily check that the situation is the same for other parameters. The problem with that is that these pieces are of the order of 10^-6, whereas the initial step of the fitting procedure is somewhere around 10^-8, hence the procedure ends quickly concluding that you cannot improve from the given initial condition. You could try to fix it by changing epsfcn parameter in curve_fit, but you would quickly notice that the landscape, on top of being piecewise constant, is also very "rugged". In other words, curve_fit is simply not well suited for such a problem, which is simply difficult for gradient based methods, as it is highly non-convex. Probably, some stochastic optimisation methods could do a better job. That is, however, a different question/problem.

Question 2

I think it is a local minimum, or the algorith fails for a non trivial reason. It is far easier to fit the data to the input, instead of fitting the statistical description of the data to the statistical description of the input.

Here's a modified version of the code doing so:

z = np.linspace(0,1,20000,endpoint=True)

def grating_hist_indicator(x,frac,xmax,x0):
  #  model data to be turned into a histogram
  dx = x[1]-x[0]
  grating = np.cos(frac*np.pi*z)
  norm_grating = xmax*(grating-grating[-1])/(1-grating[-1])+x0
  return norm_grating

x = np.linspace(0,5,512)
p_data = [0.7,1.1,0.8]
pct = grating_hist(x,*p_data)

pct_indicator = grating_hist_indicator(x,*p_data)
p_guess = [1,1,1]
p_fit,pcov = curve_fit(grating_hist_indicator,x,pct_indicator,p0=p_guess)

plot(x,pct,label='Data')
plot(x,grating_hist(x,*p_fit),label='Fit')
legend()
show()