Fit log-log data with gnuplot

Question 1

You need reasonable starting values to get a correct fit, because a nonlinear least-squares fit can converge to a local minimum instead of the best solution. If you don't define a and b, gnuplot initializes both to 1, which may be too far from the true values. Try using

a = 100
b = -3

for a better start. You may need to tweak those values further; I couldn't, because I don't have the data file.
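Put together, a minimal sketch might look like the following. It assumes the model is a plain power law f(x) = a*x**b, which is a guess on my part since the question's definition of f(x) is not repeated here; the file name is the one from the question.

```gnuplot
# Assumption: power-law model; replace with your own f(x) if it differs.
f(x) = a * x**b

# Starting values suggested above; adjust them to your data.
a = 100
b = -3

# Fit columns 1 and 2 of the data file, varying a and b.
fit f(x) "revErrEnergyGfortCaotic.txt" via a,b

# Inspect the result on log-log axes.
set logscale xy
plot "revErrEnergyGfortCaotic.txt" with points, f(x) with lines
```

If the fitted curve still misses the data badly, change a and b by hand and run the fit again.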

Also, you might want to restrict the fit to x values above 10:

fit [10:] f(x) "revErrEnergyGfortCaotic.txt" via a,b

Do this only if it is appropriate for your data, of course.

Question 2

This is a common issue in data analysis, and I'm not certain if there's a nice Gnuplot way to solve it.

The issue is that the penalty function in standard fitting routines is typically the sum of squared errors. If your data have a lot of dynamic range, the absolute errors at the smallest y-values contribute essentially nothing to that sum, so from the point of view of the algorithm they are zero, try as you might.
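To make the scale of the problem concrete, here is a small sketch with made-up numbers (two hypothetical points from a data set spanning 20 decades, not taken from any real file). A 1% miss on the large point gives a squared residual of about 1e16, while a 100% miss on the small point gives about 1e-20; only when each residual is divided by y do the two become comparable.

```gnuplot
# Hypothetical y values at the two ends of the data range.
y_big   = 1e10
y_small = 1e-10

# Absolute residuals: 1% off at the top, 100% off at the bottom.
r_big   = 0.01 * y_big
r_small = 1.0  * y_small

# Contribution of each point to an unweighted sum of squares.
print "unweighted, large point: ", r_big**2
print "unweighted, small point: ", r_small**2

# Contribution when each residual is scaled by 1/y (relative error).
print "relative, large point:   ", (r_big/y_big)**2
print "relative, small point:   ", (r_small/y_small)**2
```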

I recently taught a course in which students needed to fit such data. Lots of them beat their (MATLAB) fitting routines into submission by choosing very stringent convergence criteria, but even this did not help much.

What you really need to do, if you want to fit this power-law tail well, is to convert the data into log-log form and run a linear regression on that log-log representation.
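In gnuplot, one way to sketch this is to transform both columns inside the using specification and fit a straight line. The file name 'data.txt' and the names g, m and c are placeholders of mine, and the approach assumes all x and y values are strictly positive.

```gnuplot
# Straight line in log-log space: log(y) = m*log(x) + c
g(x) = m*x + c

# Rough initial guesses; a linear fit is not very sensitive to them.
m = -1
c = 0

# Fit log(y) against log(x); log() is the natural logarithm in gnuplot.
fit g(x) 'data.txt' using (log($1)):(log($2)) via m,c

# Convert back to the power law y = a * x**b.
a = exp(c)
b = m
print "a = ", a, "  b = ", b
```

The slope m is the power-law exponent directly, and the intercept gives the prefactor through a = exp(c).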

Question 3

The main problem here is that the residuals at higher x values are very small compared to the residuals at lower x values. After all, your data span almost 20 orders of magnitude on the y axis.

Just weight the y values with 1/y**2, or even better: if you have the standard deviations of your data points, weight the values with 1/std**2. Then the fit should converge much better.

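In gnuplot, weighting is done with a third data column. Note that fit interprets this column as the standard deviation s of each y value and weights the point by 1/s**2, so a weight of 1/y**2 corresponds to passing y itself:

fit f(x) 'data' using 1:2:($2) via ...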

Or you can follow Raman Shah's advice: take the logarithm of both axes and do a linear regression.
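A fuller sketch of the weighted fit could look like this. It assumes a power-law model and the starting values a = 100, b = -3, neither of which comes from your script, and it relies on gnuplot treating the third using column as the standard deviation s of each point (weight 1/s**2).

```gnuplot
# Assumed model and starting values; adapt both to your problem.
f(x) = a * x**b
a = 100
b = -3

# Relative weighting: s = y, so each point gets weight 1/y**2.
# (gnuplot 5 also accepts an explicit "yerror" keyword before "via".)
fit f(x) 'data' using 1:2:($2) via a,b

# If a third column holds measured standard deviations, use it directly:
# fit f(x) 'data' using 1:2:3 via a,b
```

With s = y every point contributes its relative error, so the small values in the tail count as much as the large ones.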

Question 4

You need to use weights for your fit (currently the low values are not treated as important) and a better starting guess (via "pars_file.pars").
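As a rough sketch of both points together: the parameter file holds one `name = value` line per parameter, and the fit reads its starting values from it. The model, the numbers and the data file name 'data.txt' below are assumptions for illustration, and the first lines overwrite any existing pars_file.pars.

```gnuplot
# Assumed model; replace with your own function.
f(x) = a * x**b

# Write hypothetical starting values to the parameter file.
set print "pars_file.pars"
print "a = 100"
print "b = -3"
unset print

# Third column is taken as the standard deviation s (weight 1/s**2);
# passing y itself gives every point the weight 1/y**2.
fit f(x) 'data.txt' using 1:2:($2) via "pars_file.pars"
```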