Frage

If I have a function f(x) = y that I don't know the form of, and if I have a long list of x and y value pairs (potentially thousands of them), is there a program/package/library that will generate potential forms of f(x)?

Obviously there's a lot of ambiguity to the possible forms of any f(x), so something that produces many non-trivial unique answers (in reduced terms) would be ideal, but something that could produce at least one answer would also be good.

If x and y are derived from observational data (i.e. experimental results), are there programs that can create approximate forms of f(x)? On the other hand, if you know beforehand that there is a completely deterministic relationship between x and y (as in the input and output of a pseudo random number generator) are there programs than can create exact forms of f(x)?

War es hilfreich?

Lösung

Soooo, I found the answer to my own question. Cornell has released a piece of software for doing exactly this kind of blind fitting called Eureqa. It has to be one of the most polished pieces of software that I've ever seen come out of an academic lab. It's seriously pretty nifty. Check it out:

enter image description here

It's even got turnkey integration with Amazon's ec2 clusters, so you can offload some of the heavy computational lifting from your local computer onto the cloud at the push of a button for a very reasonable fee.

I think that I'm going to have to learn more about GUI programming so that I can steal its interface.

Andere Tipps

(This is more of a numerical methods question.) If there is some kind of observable pattern (you can kinda see the function), then yes, there are several ways you can approximate the original function, but they'll be just that, approximations.

What you want to do is called interpolation. Two very simple (and not very good) methods are Newton's method and Laplace's method of interpolation. They both work on the same principle but they are implemented differently (Laplace's is iterative, Newton's is recursive, for one).

If there's not much going on between any two of your data points (ie, the actual function doesn't have any "bumps" whose "peaks" are not represented by one of your data points), then the spline method of interpolation is one of the best choices you can make. It's a bit harder to implement, but it produces nice results.

Edit: Sometimes, depending on your specific problem, these methods above might be overkill. Sometimes, you'll find that linear interpolation (where you just connect points with straight lines) is a perfectly good solution to your problem.

It depends.

If you're using data acquired from the real-world, then statistical regression techniques can provide you with some tools to evaluate the best fit; if you have several hypothesis for the form of the function, you can use statistical regression to discover the "best" fit, though you may need to be careful about over-fitting a curve -- sometimes the best fit (highest correlation) for a specific dataset completely fails to work for future observations.

If, on the other hand, the data was generated something synthetically (say, you know they were generated by a polynomial), then you can use polynomial curve fitting methods that will give you the exact answer you need.

Yes, there are such things.

If you plot the values and see that there's some functional relationship that makes sense, you can use least squares fitting to calculate the parameter values that minimize the error.

If you don't know what the function should look like, you can use simple spline or interpolation schemes.

You can also use software to guess what the function should be. Maybe something like Maxima can help.

Wolfram Alpha can help you guess:

http://blog.wolframalpha.com/2011/05/17/plotting-functions-and-graphs-in-wolframalpha/

Polynomial Interpolation is the way to go if you have a totally random set

http://en.wikipedia.org/wiki/Polynomial_interpolation

If your set is nearly linear, then regression will give you a good approximation.

Creating exact form from the X's and Y's is mostly impossible.

Notice that what you are trying to achieve is at the heart of many Machine Learning algorithm and therefor you might find what you are looking for on some specialized libraries.

A list of x/y values N items long can always be generated by an degree-N polynomial (assuming no x values are the same). See this article for more details:

http://en.wikipedia.org/wiki/Polynomial_interpolation

Some lists may also match other function types, such as exponential, sinusoidal, and many others. It is impossible to find the 'simplest' matching function, but the best you can do is go through a list of common ones like exponential, sinusoidal, etc. and if none of them match, interpolate the polynomial.

I'm not aware of any software that can do this for you, though.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top