Is there some way to speed up recursion by remembering child nodes?

https://stackoverflow.com/questions/23962

09-06-2019
|

Question

For example, Look at the code that calculates the n-th Fibonacci number:

fib(int n)
{
    if(n==0 || n==1)
        return 1;
    return fib(n-1) + fib(n-2);
}

The problem with this code is that it will generate stack overflow error for any number greater than 15 (in most computers).

Assume that we are calculating fib(10). In this process, say fib(5) is calculated a lot of times. Is there some way to store this in memory for fast retrieval and thereby increase the speed of recursion?

I am looking for a generic technique that can be used in almost all problems.

Solution

Yes your insight is correct. This is called dynamic programming. It is usually a common memory runtime trade-off.

In the case of fibo, you don't even need to cache everything :

[edit] The author of the question seems to be looking for a general method to cache rather than a method to compute Fibonacci. Search wikipedia or look at the code of the other poster to get this answer. Those answers are linear in time and memory.

**Here is a linear-time algorithm O(n), constant in memory **

in OCaml:

let rec fibo n = 
    let rec aux = fun
        | 0 -> (1,1)
        | n -> let (cur, prec) = aux (n-1) in (cur+prec, cur)
    let (cur,prec) = aux n in prec;;



in C++:

int fibo(int n) {
    if (n == 0 ) return 1;
    if (n == 1 ) return 1;
    int p = fibo(0);
    int c = fibo(1);
    int buff = 0;
    for (int i=1; i < n; ++i) {
      buff = c;
      c = p+c;
      p = buff;
    };
    return c;
};

This perform in linear time. But log is actually possible !!! Roo's program is linear too, but way slower, and use memory.

Here is the log algorithm O(log(n))

Now for the log-time algorithm (way way way faster), here is a method : If you know u(n), u(n-1), computing u(n+1), u(n) can be done by applying a matrix:

| u(n+1) |  = | 1 1 | | u(n)   |
| u(n)   |    | 1 0 | | u(n-1) |

So that you have :

| u(n)    |  = | 1 1 |^(n-1) | u(1) | = | 1 1 |^(n-1) | 1 |
| u(n-1)  |    | 1 0 |       | u(0) |   | 1 0 |       | 1 |

Computing the exponential of the matrix has a logarithmic complexity. Just implement recursively the idea :

M^(0)    = Id
M^(2p+1) = (M^2p) * M
M^(2p)   = (M^p) * (M^p)  // of course don't compute M^p twice here.

You can also just diagonalize it (not to difficult), you will find the gold number and its conjugate in its eigenvalue, and the result will give you an EXACT mathematical formula for u(n). It contains powers of those eigenvalues, so that the complexity will still be logarithmic.

Fibo is often taken as an example to illustrate Dynamic Programming, but as you see, it is not really pertinent.

@John: I don't think it has anything to do with do with hash.

@John2: A map is a bit general don't you think? For Fibonacci case, all the keys are contiguous so that a vector is appropriate, once again there are much faster ways to compute fibo sequence, see my code sample over there.

OTHER TIPS

This is called memoization and there is a very good article about memoization Matthew Podwysocki posted these days. It uses Fibonacci to exemplify it. And shows the code in C# also. Read it here.

If you're using C#, and can use PostSharp, here's a simple memoization aspect for your code:

[Serializable]
public class MemoizeAttribute : PostSharp.Laos.OnMethodBoundaryAspect, IEqualityComparer<Object[]>
{
    private Dictionary<Object[], Object> _Cache;

    public MemoizeAttribute()
    {
        _Cache = new Dictionary<object[], object>(this);
    }

    public override void OnEntry(PostSharp.Laos.MethodExecutionEventArgs eventArgs)
    {
        Object[] arguments = eventArgs.GetReadOnlyArgumentArray();
        if (_Cache.ContainsKey(arguments))
        {
            eventArgs.ReturnValue = _Cache[arguments];
            eventArgs.FlowBehavior = FlowBehavior.Return;
        }
    }

    public override void OnExit(MethodExecutionEventArgs eventArgs)
    {
        if (eventArgs.Exception != null)
            return;

        _Cache[eventArgs.GetReadOnlyArgumentArray()] = eventArgs.ReturnValue;
    }

    #region IEqualityComparer<object[]> Members

    public bool Equals(object[] x, object[] y)
    {
        if (Object.ReferenceEquals(x, y))
            return true;

        if (x == null || y == null)
            return false;

        if (x.Length != y.Length)
            return false;

        for (Int32 index = 0, len = x.Length; index < len; index++)
            if (Comparer.Default.Compare(x[index], y[index]) != 0)
                return false;

        return true;
    }

    public int GetHashCode(object[] obj)
    {
        Int32 hash = 23;

        foreach (Object o in obj)
        {
            hash *= 37;
            if (o != null)
                hash += o.GetHashCode();
        }

        return hash;
    }

    #endregion
}

Here's a sample Fibonacci implementation using it:

[Memoize]
private Int32 Fibonacci(Int32 n)
{
    if (n <= 1)
        return 1;
    else
        return Fibonacci(n - 2) + Fibonacci(n - 1);
}

Quick and dirty memoization in C++:

Any recursive method type1 foo(type2 bar) { ... } is easily memoized with map<type2, type1> M.

// your original method
int fib(int n)
{
    if(n==0 || n==1)
        return 1;
    return fib(n-1) + fib(n-2);
}

// with memoization
map<int, int> M = map<int, int>();
int fib(int n)
{
    if(n==0 || n==1)
        return 1;

    // only compute the value for fib(n) if we haven't before
    if(M.count(n) == 0)
        M[n] = fib(n-1) + fib(n-2);

    return M[n];
}

EDIT: @Konrad Rudolph
Konrad points out that std::map is not the fastest data structure we could use here. That's true, a vector<something> should be faster than a map<int, something> (though it might require more memory if the inputs to the recursive calls of the function were not consecutive integers like they are in this case), but maps are convenient to use generally.

According to wikipedia Fib(0) should be 0 but it does not matter.

Here is simple C# solution with for cycle:

ulong Fib(int n)
{
  ulong fib = 1;  // value of fib(i)
  ulong fib1 = 1; // value of fib(i-1)
  ulong fib2 = 0; // value of fib(i-2)

  for (int i = 0; i < n; i++)
  {
    fib = fib1 + fib2;
    fib2 = fib1;
    fib1 = fib;
  }

  return fib;
}

It is pretty common trick to convert recursion to tail recursion and then to loop. For more detail see for example this lecture (ppt).

What language is this? It doesnt overflow anything in c... Also, you can try creating a lookup table on the heap, or use a map

caching is generally a good idea for this kind of thing. Since fibonacci numbers are constant, you can cache the result once you have calculated it. A quick c/pseudocode example

class fibstorage {


    bool has-result(int n) { return fibresults.contains(n); }
    int get-result(int n) { return fibresult.find(n).value; }
    void add-result(int n, int v) { fibresults.add(n,v); }

    map<int, int>   fibresults;

}


fib(int n ) {
    if(n==0 || n==1)
            return 1;

    if (fibstorage.has-result(n)) {
        return fibstorage.get-result(n-1);
    }

    return ( (fibstorage.has-result(n-1) ? fibstorage.get-result(n-1) : fib(n-1) ) +
             (fibstorage.has-result(n-2) ? fibstorage.get-result(n-2) : fib(n-2) )
           );
}


calcfib(n) {
    v = fib(n);
    fibstorage.add-result(n,v);
}

This would be quite slow, as every recursion results in 3 lookups, however this should illustrate the general idea

Is this a deliberately chosen example? (eg. an extreme case you're wanting to test)

As it's currently O(1.6^n) i just want to make sure you're just looking for answers on handling the general case of this problem (caching values, etc) and not just accidentally writing poor code :D

Looking at this specific case you could have something along the lines of:

var cache = [];
function fib(n) {
    if (n < 2) return 1;
    if (cache.length > n) return cache[n];
    var result = fib(n - 2) + fib(n - 1);
    cache[n] = result;
    return result;
}

Which degenerates to O(n) in the worst case :D

[Edit: * does not equal + :D ]

[Yet another edit: the Haskell version (because i'm a masochist or something)

fibs = 1:1:(zipWith (+) fibs (tail fibs))
fib n = fibs !! n

]

Try using a map, n is the key and its corresponding Fibonacci number is the value.

@Paul

Thanks for the info. I didn't know that. From the Wikipedia link you mentioned:

This technique of saving values that have already been calculated is called memoization

Yeah I already looked at the code (+1). :)

@ESRogs:

std::map lookup is O(log n) which makes it slow here. Better use a vector.

vector<unsigned int> fib_cache;
fib_cache.push_back(1);
fib_cache.push_back(1);

unsigned int fib(unsigned int n) {
    if (fib_cache.size() <= n)
        fib_cache.push_back(fib(n - 1) + fib(n - 2));

    return fib_cache[n];
}

Others have answered your question well and accurately - you're looking for memoization.

Programming languages with tail call optimization (mostly functional languages) can do certain cases of recursion without stack overflow. It doesn't directly apply to your definition of Fibonacci, though there are tricks..

The phrasing of your question made me think of an interesting idea.. Avoiding stack overflow of a pure recursive function by only storing a subset of the stack frames, and rebuilding when necessary.. Only really useful in a few cases. If your algorithm only conditionally relies on the context as opposed to the return, and/or you're optimizing for memory not speed.

Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:

fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]

That's it. It caches (memoizes) fib[0] and fib[1] off the bat and caches the rest as needed. The rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.

One more excellent resource for C# programmers for recursion, partials, currying, memoization, and their ilk, is Wes Dyer's blog, though he hasn't posted in awhile. He explains memoization well, with solid code examples here: http://blogs.msdn.com/wesdyer/archive/2007/01/26/function-memoization.aspx

The problem with this code is that it will generate stack overflow error for any number greater than 15 (in most computers).

Really? What computer are you using? It's taking a long time at 44, but the stack is not overflowing. In fact, your going to get a value bigger than an integer can hold (~4 billion unsigned, ~2 billion signed) before the stack is going to over flow (Fibbonaci(46)).

This would work for what you want to do though (runs wiked fast)

class Program
{
    public static readonly Dictionary<int,int> Items = new Dictionary<int,int>();
    static void Main(string[] args)
    {
        Console.WriteLine(Fibbonacci(46).ToString());
        Console.ReadLine();
    }

    public static int Fibbonacci(int number)
    {
        if (number == 1 || number == 0)
        {
            return 1;
        }

        var minus2 = number - 2;
        var minus1 = number - 1;

        if (!Items.ContainsKey(minus2))
        {
            Items.Add(minus2, Fibbonacci(minus2));
        }

        if (!Items.ContainsKey(minus1))
        {
            Items.Add(minus1, Fibbonacci(minus1));
        }

        return (Items[minus2] + Items[minus1]);
    }
}

If you're using a language with first-class functions like Scheme, you can add memoization without changing the initial algorithm:

(define (memoize fn)
  (letrec ((get (lambda (query) '(#f)))
           (set (lambda (query value)
                  (let ((old-get get))
                    (set! get (lambda (q)
                                (if (equal? q query)
                                    (cons #t value)
                                    (old-get q))))))))
    (lambda args
      (let ((val (get args)))
        (if (car val)
            (cdr val)
            (let ((ret (apply fn args)))
              (set args ret)
              ret))))))


(define fib (memoize (lambda (x)
                       (if (< x 2) x
                           (+ (fib (- x 1)) (fib (- x 2)))))))

The first block provides a memoization facility and the second block is the fibonacci sequence using that facility. This now has an O(n) runtime (as opposed to O(2^n) for the algorithm without memoization).

Note: the memoization facility provided uses a chain of closures to look for previous invocations. At worst case this can be O(n). In this case, however, the desired values are always at the top of the chain, ensuring O(1) lookup.

As other posters have indicated, memoization is a standard way to trade memory for speed, here is some pseudo code to implement memoization for any function (provided the function has no side effects):

Initial function code:

 function (parameters)
      body (with recursive calls to calculate result)
      return result

This should be transformed to

 function (parameters)
      key = serialized parameters to string
      if (cache[key] does not exist)  {
           body (with recursive calls to calculate result)
           cache[key] = result
      }
      return cache[key]

By the way Perl has a memoize module that does this for any function in your code that you specify.

# Compute Fibonacci numbers
sub fib {
      my $n = shift;
      return $n if $n < 2;
      fib($n-1) + fib($n-2);
}

In order to memoize this function all you do is start your program with

use Memoize;
memoize('fib');
# Rest of the fib function just like the original version.
# Now fib is automagically much faster ;-)

@lassevk:

This is awesome, exactly what I had been thinking about in my head after reading about memoization in Higher Order Perl. Two things which I think would be useful additions:

An optional parameter to specify a static or member method that is used for generating the key to the cache.
An optional way to change the cache object so that you could use a disk or database backed cache.

Not sure how to do this sort of thing with Attributes (or if they are even possible with this sort of implementation) but I plan to try and figure out.

(Off topic: I was trying to post this as a comment, but I didn't realize that comments have such a short allowed length so this doesn't really fit as an 'answer')

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow