Question

I am trying to fit the data shown in this plot [image]. As you can see, the fit does not describe the data well.

My code is:

clear
reset


set terminal pngcairo size 1000,600 enhanced font 'Verdana,10'
set output 'LocalEnergyStepZoom.png'
set ylabel '{/Symbol D}H/H_0'
set xlabel 'n_{step}'
set format y '%.2e'

set xrange [*:*]
set yrange [1e-16:*]

f(x) = a*x**b
fit f(x) "revErrEnergyGfortCaotic.txt" via a,b


set logscale

plot 'revErrEnergyGfortCaotic.txt' w p,\
 'revErrEnergyGfortRegular.txt' w p,\
f(x) w l lc rgb "black" lw 3 

exit

So my question is: what mistake am I making here? I would expect a fit of the form used in the code to represent the data very well on a log-log plot.

Thanks a lot

In the end I was able to solve the problem by following the suggestion in Christop's answer, with a small modification.

I first estimated the approximate slope of the function (somewhere near -4). Keeping this parameter fixed, I fitted the curve with only a free; once I had found a, I fixed it and fitted only b. After that, using the output as the starting point for the fit, I found the best fit. [image]
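
The staged procedure described above could be sketched in gnuplot roughly as follows. The starting value a = 1 and the final joint fit over both parameters are assumptions; only the slope estimate of about -4 comes from the description.

```gnuplot
f(x) = a*x**b

# Rough starting point: slope read off the log-log plot (about -4).
# a = 1 is an assumed placeholder value.
a = 1
b = -4

# Step 1: keep b fixed and fit only a.
fit f(x) 'revErrEnergyGfortCaotic.txt' via a

# Step 2: keep the fitted a fixed and fit only b.
fit f(x) 'revErrEnergyGfortCaotic.txt' via b

# Step 3 (assumed): release both parameters, starting from the values found above.
fit f(x) 'revErrEnergyGfortCaotic.txt' via a,b
```

Each call starts from the values left behind by the previous one, so the last fit begins close to the solution.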

Solution

You must choose appropriate starting values to get a correct fit, because this kind of nonlinear fit does not have a single global solution and the result depends on where the iteration starts. If you don't define a and b, both are set to 1, which might be too far from the solution. Try using

a = 100
b = -3

for a better start. You may need to tweak those values a bit more; I couldn't because I don't have the data file.

Also, you might want to restrict the region of the fitting to the part above 10:

fit [10:] f(x) "revErrEnergyGfortCaotic.txt" via a,b

Do this only if it is appropriate for your data, of course.

OTHER TIPS

This is a common issue in data analysis, and I'm not certain if there's a nice Gnuplot way to solve it.

The issue is that the penalty functions in standard fitting routines are typically the sum of squares of errors, and try as you might, if your data have a lot of dynamic range, the errors for the smallest y-values come out to essentially zero from the point of view of the algorithm.

I recently taught a course where students needed to fit such data. Many of them beat their (MATLAB) fitting routines into submission by choosing very stringent convergence criteria, but even this did not help much.

What you really need to do, if you want to fit this power-law tail well, is to convert the data into log-log form and run a linear regression on that log-log representation.
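
A minimal gnuplot sketch of that log-log regression, assuming the data file from the question has x in column 1 and y in column 2 and that all values are strictly positive. The names g and loga are introduced here only for illustration.

```gnuplot
# In log-log space the power law y = a*x**b becomes a straight line:
#   log(y) = log(a) + b*log(x)
g(x) = loga + b*x

# Starting values for the linear fit (assumed; a linear fit is not sensitive to them).
loga = 1
b = -1

# Fit the line to the logarithms of both columns.
fit g(x) 'revErrEnergyGfortCaotic.txt' using (log($1)):(log($2)) via loga,b

# Convert back to the original parameters for plotting on logscale axes.
a = exp(loga)
f(x) = a*x**b
```

Because every point then contributes on the same logarithmic scale, the small y values in the tail are no longer ignored by the sum of squares.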

The main problem here is that the residuals at higher x values are very small compared to the residuals at lower x values. After all, your data span almost 20 orders of magnitude on the y axis.

Just weight the y values with 1/y**2 or, even better, if you have the standard deviations of your data points, weight the values with 1/std**2. The fit should then converge much better.

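In gnuplot, weighting is done through a third data column. gnuplot interprets that column as the standard deviation s of each y value and uses 1/s**2 as the weight, so for weights of 1/y**2 you pass y itself as the third column (gnuplot 5 and later also accept an explicit yerror keyword after the using specification):

fit f(x) 'data' using 1:2:2 via ...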

Or you can follow Raman Shah's advice: take the logarithm of the data and do a linear regression on the log-log representation.

You need to use weights for your fit (currently the low values are not treated as important) and a better starting guess (via "pars_file.pars").
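
A sketch of how these two suggestions might be combined, assuming a parameter file named pars_file.pars that holds one `name = value` line per parameter. The starting values shown are the ones suggested in the accepted answer, and weighting with s = y is an assumption about what suits the data.

```gnuplot
# Assumed contents of pars_file.pars:
#   a = 100
#   b = -3

f(x) = a*x**b

# The third using column is the standard deviation s of each y value;
# passing y itself gives every point the weight 1/y**2.
# On gnuplot 5 and later, the explicit yerror keyword can follow the using specification.
fit f(x) 'revErrEnergyGfortCaotic.txt' using 1:2:2 via 'pars_file.pars'
```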

Licensed under: CC-BY-SA with attribution