Question

There are a lot of related questions here on SO, but they all ask about writing a program to compute the complexity of arbitrary algorithms (which is obviously undecidable). I am willing to make the following restrictions on the input:

  1. The algorithm terminates
  2. The algorithm is purely functional

The question is, can a program be written to compute the time complexity of such an algorithm through static analysis? If the input algorithm does not terminate, the program behaviour is undefined (it may crash, return a lie, or fail to terminate).


Other tips

You can't be 100% sure you're getting a correct answer from any technique that estimates the complexity based on real-world running time. This is because the exact running time can be a very complicated function, so the running time can theoretically follow any other function while the input size is below some extremely large number. The running time only needs to tend towards (some multiple of) the complexity function as the input size tends towards infinity. This assumes you want a tight bound (which exists for many, but not all, algorithms) and not just an upper or lower bound.

But you can come up with some reasonable estimate of the complexity that should generally be quite accurate.

Also note that a number of algorithms have different running times for different inputs of the same size. You could try running the below for a few different inputs of the same size and averaging the result to mitigate this. This would also help mitigate system conditions that may affect the running time. Although you may not be able to estimate the complexity of the worst and best cases if you don't know the specific input to use for those (as they may be too rare for you to get them while passing in random data).

How to do this:

Record the time for some sufficiently large and sufficiently different-sized inputs (e.g. you can run it for inputs of sizes equal to different powers of 10, like 100, 1000 and 10000, and these should be big enough for it to run for at least a few seconds to make the data less noisy). Let's use 3 input sizes. Strictly speaking you only need 2 input sizes, but you can use 3 or more as additional verification.
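A minimal timing harness for this step could look like the sketch below. `algorithm` and `make_input` are placeholders for your own function under test and an input generator for a given size; here a trivially linear stand-in (`sum` over a list) is used for illustration.

```python
import time

def measure(algorithm, make_input, sizes):
    """Run `algorithm` once per input size and record the wall-clock time."""
    timings = {}
    for n in sizes:
        data = make_input(n)          # build an input of size n
        start = time.perf_counter()
        algorithm(data)
        timings[n] = time.perf_counter() - start
    return timings

# Example with a deliberately linear-time stand-in:
timings = measure(sum, lambda n: list(range(n)), [100, 1000, 10000])
```

In practice you would repeat each measurement a few times and average, as the answer suggests, to smooth out noise from system conditions.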

Now we can try to map these 3 results we got to one of some set of complexities like O(1), O(log n), O(sqrt(n)), O(n), O(n log n), O(n²), O(n³), etc.

If you're trying to match it manually, you could plot the running times you got on a graph along with the graph of each of the above functions (scaled appropriately) and see which one matches best.

If you're trying to automate it, you can try to map each of the functions to the input size and see how close it matches.

There are better ways to do this, but one really simple way to do it would be as follows:

Say you have these running times:

input size   running time
100          21 seconds
1000         29 seconds
10000        40 seconds

Now you can try to match one of those (say the biggest one, which is likely to be the most accurate) to a multiple of one of the above functions.

O(n):     k x n     = k x 10000     = 40,    k = 40 / 10000     = 0.004
O(log n): k x log n = k x log 10000 = 40,    k = 40 / log 10000 = 10
O(n²):    k x n²    = k x 10000²    = 40,    k = 40 / 10000²    = 0.0000004

Now compare what the equation gives you to what the actual running time is for the other input sizes:

For n = 1000, actual running time = 29 seconds
O(n):     0.004 x 1000      = 4 seconds
O(log n): 10 x log 1000     = 30 seconds
O(n²):    0.0000004 x 1000² = 0.4 seconds

For n = 100, actual running time = 21 seconds
O(n):     0.004 x 100      = 0.4 seconds
O(log n): 10 x log 100     = 20 seconds
O(n²):    0.0000004 x 100² = 0.004 seconds

Looking at this, we can clearly see that O(log n) is the closest, with the actual and predicted running times differing by only 1 second in both cases. So that would be our guess for the complexity.
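The fit-and-compare procedure above can be sketched in a few lines: fit the constant k at the largest input size, predict the running times at the other sizes, and pick the candidate with the smallest total error. The measured times and candidate set are taken from the worked example (log base 10, matching the numbers above).

```python
import math

# Measured running times from the worked example above
times = {100: 21.0, 1000: 29.0, 10000: 40.0}

# Candidate complexity functions (log base 10 to match the example)
models = {
    "O(n)":      lambda n: n,
    "O(log n)":  lambda n: math.log10(n),
    "O(n^2)":    lambda n: n ** 2,
}

largest = max(times)
best_name, best_err = None, float("inf")
for name, f in models.items():
    # Fit the constant at the biggest (most accurate) measurement
    k = times[largest] / f(largest)
    # Total absolute prediction error at the remaining sizes
    err = sum(abs(k * f(n) - t) for n, t in times.items() if n != largest)
    if err < best_err:
        best_name, best_err = name, err

print(best_name)  # O(log n)
```

With these numbers, O(log n) wins by a wide margin (total error 2 seconds versus about 45 for the other two candidates), matching the conclusion above.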

Given the restriction that the algorithm stops, it's possible. Execute the algorithm for each possible input and measure the execution time. Next, choose a function as a possible upper bound and test it against each of the results. If it's not good enough, increase the bound and retest. Repeat until the bound is good enough.

Edit: This solution assumes the boundaries of a real computer program, i.e. that the set of different inputs isn't infinite. Otherwise, it's not possible to compute the complexity of a general algorithm. Consider an algorithm whose complexity satisfies O(n) = n^O(n-1). Since the set of inputs is infinite, you won't be able to find any function f that bounds the complexity.
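The "increase the bound and retest" loop from this answer can be sketched as follows. It assumes a `measurements` dict produced by timing the algorithm on every feasible input size (here reusing the example numbers from the first answer), a hypothetical candidate function `f`, and a constant `k` that is doubled until k·f(n) dominates every measurement.

```python
def is_upper_bound(f, k, measurements):
    """Check whether k * f(n) dominates every measured running time."""
    return all(k * f(n) >= t for n, t in measurements.items())

# Example measurements (seconds) from timing the algorithm per input size
measurements = {100: 21.0, 1000: 29.0, 10000: 40.0}

f = lambda n: n   # candidate bound: O(n)
k = 1.0
while not is_upper_bound(f, k, measurements):
    k *= 2        # bound not good enough yet; loosen it and retest
```

Note this only certifies an upper bound over the inputs actually measured, which is exactly the finiteness restriction the edit points out.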

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow