Question

I am looking for a method to get a formula by fitting a set of data (3000 points). I was using Legendre polynomials, but for more than 20 points the fit does not give exact values. I can write a chi2 test, but the algorithm needs a lot of time to calculate N parameters, and at the beginning I don't know what the function looks like, so it takes time. I was thinking about splines... Maybe ...

So the input is: 3000 points

Output : f(x) = ... something

I want to get a formula from the fit. What is the best way to do this in Python?

May the force be with us! Nykon


Solution

How about a polynomial fit:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html

or some other interpolation scheme:

http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

It is difficult to recommend a suitable method without knowing more about the dataset and something about how good of a fit is required.
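As a rough sketch of the polynomial route, here is a low-degree least-squares fit with `numpy.polyfit`. The data below is synthetic (a noisy quadratic I made up as a stand-in for the 3000 points), and the degree of 2 is an assumption you would have to choose for your own data.

```python
import numpy as np

# Synthetic stand-in for the 3000 measured points (assumption: a noisy quadratic).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 3000)
y = 1.5 * x**2 - 0.5 * x + 2.0 + rng.normal(scale=0.05, size=x.size)

# Least-squares fit of a degree-2 polynomial.
# np.polyfit returns the coefficients with the highest power first.
coeffs = np.polyfit(x, y, deg=2)

# np.poly1d wraps the coefficients in a callable f(x).
f = np.poly1d(coeffs)
print(f)

# Root-mean-square residual as a quick check of fit quality.
rms = np.sqrt(np.mean((f(x) - y) ** 2))
print("RMS residual:", rms)
```

The coefficients in `coeffs` are the "formula". If the RMS residual is much larger than the noise you expect in the data, the chosen degree (or the polynomial form itself) is probably a poor match.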

Other tips

Except, a spline does not give you a "formula", at least not unless you have the wherewithal to deal with all of the piecewise segments. Even then, it will not be easily written down, or give you anything that is at all pretty to look at.

A simple spline gives you an interpolant. Worse, for 3000 points, an interpolating spline will give you roughly that many cubic segments! You did say interpolation before. Of course, an interpolating polynomial of that high an order will be complete crapola anyway, so don't think you can just go back there.

If all that you need is a tool that can provide an exact interpolation at any point, and you really don't need to have an explicit formula, then an interpolating spline is a good choice.
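If exact interpolation is all you need, a minimal sketch with `scipy.interpolate.CubicSpline` might look like this. The sine data is a placeholder for your own points, which need strictly increasing x values.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Placeholder data (assumption): 3000 points with strictly increasing x.
x = np.linspace(0.0, 10.0, 3000)
y = np.sin(x)

# Build the interpolating cubic spline; it passes through every data point.
spline = CubicSpline(x, y)

# The piecewise coefficients live in spline.c with shape (4, n_points - 1),
# i.e. one cubic per interval between neighbouring points.
n_segments = spline.c.shape[1]
print("number of cubic segments:", n_segments)

# Evaluate anywhere inside the data range, no explicit formula needed.
value = spline(1.2345)
print("interpolated value:", value)
```

Note the segment count: one cubic per interval, which is why this gives you an evaluation tool rather than a formula you could write down.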

Or do you really want an approximant? A function that will APPROXIMATELY fit your data, smoothing out any noise? The fact is, a lot of the time when people who have no idea what they are doing say "interpolation" they really do mean approximation, smoothing. This is possible of course, but there are entire books written on the subject of curve fitting, the modeling of empirical data. Your first goal is then to choose an intelligent model that will represent this data. Best of course is if you have some intelligent choice of model from physical understanding of the relationship under study; then you can estimate the parameters of that model using a nonlinear regression scheme, of which there are many to be found.
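To illustrate the nonlinear regression route, here is a sketch using `scipy.optimize.curve_fit`. The exponential-decay model, the synthetic data, and the starting guess `p0` are all assumptions made up for the example; in practice the model should come from what you know about the system.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, k, c):
    """Hypothetical model: exponential decay with amplitude a, rate k, offset c."""
    return a * np.exp(-k * x) + c

# Synthetic stand-in for the measured data (assumption).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 3000)
y = model(x, 2.0, 1.3, 0.5) + rng.normal(scale=0.02, size=x.size)

# Nonlinear least squares; p0 is a rough starting guess for (a, k, c).
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])

# One-sigma uncertainties from the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))

a, k, c = popt
print(f"f(x) = {a:.3f} * exp(-{k:.3f} * x) + {c:.3f}")
print("parameter uncertainties:", perr)
```

With only three parameters this is fast even for 3000 points, and the result is a formula you can actually write down.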

If you have no model, and are unwilling to choose one that roughly has the proper shape, then you are left with generic models in the form of splines, which can be fit in a regression sense, or with high order polynomial models, for which I have little respect.
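For the "spline fit in a regression sense" case, one possible sketch uses `scipy.interpolate.LSQUnivariateSpline` with a small, hand-picked set of interior knots. The data, the noise level, and the knot positions are assumptions for illustration only.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

# Synthetic noisy data (assumption) standing in for the 3000 points.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 3000)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# A handful of interior knots chosen by hand; these control the flexibility.
interior_knots = np.linspace(1.0, 9.0, 9)

# Least-squares cubic spline: it approximates the data instead of
# passing through every point.
spline = LSQUnivariateSpline(x, y, interior_knots, k=3)

# Number of cubic pieces is set by the knots, not by the number of data points.
n_pieces = len(interior_knots) + 1
print("cubic pieces:", n_pieces)
print("sum of squared residuals:", spline.get_residual())
```

Unlike the interpolating spline, the number of pieces here is small, so the noise is smoothed out. It is still a piecewise function though, not a single tidy formula.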

My point in all of this is YOU need to make some choices and do some research on a choice of model.

The only exact formula would be an interpolating polynomial of degree 2999, which passes through all 3000 points.

How good does the fit need to be? What type of formula do you expect?

You could sample your observed points (randomly is best) and fit a cubic spline to this sample (if you repeat this procedure, you can create a distribution of splines). Fitting a spline to 3,000 points is a bit much, but generating a distribution of splines based on samples could give you an idea of what the function will look like. As Josh mentioned above, http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html is a good place to start your search.
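A rough sketch of that resampling idea follows. The synthetic data, the sample size of 50, and the 200 repetitions are arbitrary assumptions; I also keep the two end points in every sample so the spline never has to extrapolate.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Synthetic stand-in for the observed points (assumption).
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 3000)
y = np.sin(x) + rng.normal(scale=0.02, size=x.size)

grid = np.linspace(x[0], x[-1], 200)   # common grid for comparing the splines
n_repeats, sample_size = 200, 50       # arbitrary choices

curves = np.empty((n_repeats, grid.size))
for i in range(n_repeats):
    # Random interior points without replacement, plus both end points.
    inner = rng.choice(np.arange(1, x.size - 1), size=sample_size - 2, replace=False)
    idx = np.sort(np.concatenate(([0], inner, [x.size - 1])))
    # Interpolating cubic spline through this subsample only.
    curves[i] = CubicSpline(x[idx], y[idx])(grid)

# Summarise the distribution of splines at each grid point.
median_curve = np.median(curves, axis=0)
lower, upper = np.percentile(curves, [5, 95], axis=0)
print("widest 5-95% band:", np.max(upper - lower))
```

Plotting `median_curve` with the `lower` and `upper` band gives a feel for the overall shape and for where the data pins the function down poorly.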

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow