Calculate the number of representations of a number as a sum of fibonacci numbers

Question 1

Let's prove some stuff about numbers:

Lemma 1: Let n ≥ 1 be an integer and Fib(i) be the largest Fibonacci number with Fib(i) ≤ n. Then in a representation of n as a sum of Fibonacci numbers with distinct indices, either Fib(i) or Fib(i - 1) appears, but not both.

Proof: We can show by induction that the sum Fib(1) + Fib(2) + ... + Fib(i - 2) = Fib(i) - 1. Since Fib(i) < n, we need at least Fib(i - 1) or Fib(i) in the representation. But not both, since Fib(i) + Fib(i - 1) = Fib(i + 1) > n (otherwise Fib(i) would not be the maximal Fibonacci number smaller than or equal to n).

Lemma 2: n - Fib(i) < Fib(i - 1) and n - Fib(i - 1) < Fib(i).

Proof: Those are easy to show. Left as an exercise to the reader.

I initially thought that this gives rise to the recurrence F(n) = F(n - Fib(i)) + F(n - Fib(i - 1)), but there is a catch: It could be that n - Fib(i - 1) ≥ Fib(i - 1), so in that case it can happen that F(i - 1) is reused, which we disallowed. We can fix that quite easily, though: We can just give the function F an additional boolean flag that tells it to disallow the recursion into F(n - Fib(i)).

One last problem remains: How to compute i? One important observation is that Fibonacci numbers grow exponentially, so we have i = O(log n). We can just use brute force to find it (compute all Fibonacci numbers up to n:

function F(n : Integer, recurseHigh = True: Bool):
    if n == 0: return 1
    a, b = 1, 1
    while a + b <= n:
        a, b = b, a + b
    res = 0
    if recurseHigh: res += F(n - b)
    res += F(n - a, n - a < a)
    return res

It happens to work fast enough even with this "stupid" implementation for 32-bit integers. If you use memoization, it sure works even for much bigger numbers, but then you need dynamically allocated memory.

I have not yet proven the runtime complexity of this, but it sure is damn fast if memoization is used. I think it's O(log² n) additions and would be O(log n * log log n) if we precompute the Fibonacci numbers up to n and do a binary search for i. Not sure about the case without memoization though, it doesn't seem to work well with n beyond 2³².

Here are some values of F in case you're interested, computed with a memoized version of the above function in Python:

F(0) = 1
F(1) = 2
F(2) = 2
F(3) = 3
F(4) = 3
F(5) = 3
F(6) = 4
F(7) = 3
F(8) = 4
F(9) = 5
F(10) = 4
F(11) = 5
F(12) = 4
F(13) = 4
F(14) = 6
F(4079078553298575003715036404948112232583483826150114126141906775660304738681982981114711241662261246) = 70875138187634460005150866420231716864000000
F(2093397132298013861818922996230028521104292633652443820564201469339117288431349400794759495467500744) = 157806495228764859558469497848542003200000000
F(1832962638825344364456510906608935117588449684478844475703210731222814604429713055795735059447061144) = 9556121706647393773891318116777984000000000
F(6529981124822323555642594388843027053160471595955101601272729237158412478312608142562647329142961542) = 7311968902691913059222356326906593280000000
F(3031139617090050615428607946661983338146903521304736547757088122017649742323928728412275969860093980) = 16200410965370556030391586130218188800000000
F(4787808019310723702107647550328703908551674469886971208491257565640200610624347175457519382346088080) = 7986384770542363809305167745746206720000000
F(568279248853026201557238405432647353484815906567776936304155013089292340520978607228915696160572347) = 213144111166298008572590523452227584000000000
F(7953857553962936439961076971832463917976466235413432258794084414322612186613216541515131230793180511) = 276031486797406622817346654015856836608000000
F(2724019577196318260962320594925520373472226823978772590344943295935004764155341943491062476123088637) = 155006702456719127405700915456167116800000000
F(4922026488474420417379924107498371752969733346340977075329964125801364261449011746275640792914985997) = 3611539307706585535227777776416785118003200
F(10^1000) = 1726698225267318906119107341755992908460082681412498622901372213247990324071889761112794235130300620075124162289430696248595221333809537338231776141120533748424614771724793270540367766223552120024230917898667149630483676495477354911576060816395737762381023625725682073094801703249961941588546705389069111632315001874553269267034143125999981126056382866927910912000000000000000000000000000000000000000000000000000000000000000000000000000000

We observe that it seems like F(n) = Θ(sqrt(n)), another result that I haven't yet proven.

UPDATE: Here's the Python code:

memo = {}
def F(n, x=True):
    if n == 0: return 1
    if (n, x) in memo: return memo[n,x]
    i = 1
    a, b = 1, 1
    while b + a <= n:
        a, b = b, a + b
    memo[n,x] = (F(n - b) if x else 0) + F(n - a, n - a < a)
    return memo[n,x]

UPDATE 2: You can get better runtime even without memoization by using binary search to find i and computing Fib(i) using fast matrix exponentiation. Probably not worth the effort though, especially not for 32-bit n.

UPDATE 3: Just for fun, here is an implementation that provably does only O(log n) additions:

fib = [0,1]
def greedy(n):
    while fib[-1] < n:
        fib.append(fib[-1] + fib[-2])
    i = 1
    while fib[i+1] <= n: i += 1
    digs = set()
    while n:
        while fib[i] > n: i -= 1
        digs.add(i)
        n -= fib[i]
    return digs

def F(n):
    digs = greedy(n)
    top = max(digs)
    dp = [[[0,0,0] for _ in xrange(4)] for _ in xrange(top+1)]
    for j in xrange(0, 2): dp[0][j][0] = 1
    for i in xrange(1, top + 1):
        for j in xrange(0,2):
            for k in xrange(0,j+1):
                if i in digs:
                    dp[i][j][k] = dp[i-1][k+j][j] + dp[i-1][k+j+1][j+1]
                else:
                    dp[i][j][k] = dp[i-1][k+j][j] + dp[i-1][k+j-1][j-1]
    return dp[top][0][0]

It first finds a greedy representation of the number in the fibonacci base and then uses DP to find the number of ways to carry over digits in that representation to build up the final number. dp[i,j,k] is the number of ways to represent the prefix 1..i of the number in fibonacci base if we have carry j onto position i and carry k onto position i - 1. Using this, we can compute F(10^50000) in under 5 seconds (the result has more than 20000 decimal digits!)

Question 2

I was intrigued by two aspects of Niklas B.'s answer: the speed of the computation (even for huge numbers), and the tendency of the results to have small prime factors. These hint that the solution can be computed as a product of small terms, and that indeed turns out to be the case.

To explain what's going on, I need some notation and terminology. For any nonnegative integer n, I'll define the (unique) greedy representation of n to be the sum of Fibonacci numbers obtained by repeatedly taking the largest Fibonacci number not exceeding n. So for example, the greedy representation of 10 is 8 + 2. It's easy to observe that we never use Fib(1) in such a greedy representation.

I also want a compact way to write these representations, and for that I'm going to use bitstrings. Much like binary, except that the place values follow the Fibonacci sequence instead of the sequence of powers of 2, and I'll write with the least significant digit first. So for example 00100001 has 1s in position 2 and position 7, so represents Fib(2) + Fib(7) = 1 + 13 = 14. (Yes, I'm starting counting at 0, and following the usual convention that Fib(0) = 0.)

A brute force way of finding all representations is to start with the greedy representation and then explore all possibilities for rewriting a subpattern of the form 001 as a pattern of the form 110; i.e., replacing Fib(k+2) with Fib(k) + Fib(k+1) for some k.

So we can always write the greedy representation of n as a bitstring, and that bitstring will be a sequence of 0s and 1s, with no two adjacent 1s. Now the key observation is that we can partition this bitstring into pieces and compute the number of rewrites for each separate piece, multiplying to get the total number of representations. This works because certain subpatterns in the bitstring prevent interaction between the rewrite rules for the portion of the string to the left of the pattern, and those to the right.

For an example, let's look at n = 78. Its greedy representation is 00010000101, and a brute-force approach quickly identifies the full set of representations. There are ten of them:

00010000101
01100000101
00010011001
01100011001
00011101001
01101101001
0001001111
0110001111
0001110111
0110110111

We can separate the first part of the pattern, 0001, from the second, 0000101. Each of the combinations above comes from rewriting 0001, separately rewriting 0000101, and glueing the two rewrites together. There are 2 rewrites (including the original) for the left portion of the pattern, and 5 for the right, so we get 10 representations overall.

What makes this work is that any rewrite of the left half, 0001, ends with either 01 or 10, while any rewrite of the right half starts with 00 or 11. So there's no match for either 001 or 110 that overlaps the boundary. We'll get this separation whenever we have two 1s separated by an even number of zeros.

And this explains the small prime factors seen in Niklas's answer: in a randomly-chosen number, there will be many sequences of 0s of even length, and each of those represents a point where we can split the computation.

The explanations are becoming a little convoluted, so here's some Python code. I've verified that the results agree with Niklas's for all n up to 10**6, and for a selection of randomly chosen large n. It should have the same algorithmic complexity.

def combinations(n):
    # Find Fibonacci numbers not exceeding n, along with their indices.
    # We don't need Fib(0) or Fib(1), so start at Fib(2).
    fibs = []
    a, b, index = 1, 2, 2
    while a <= n:
        fibs.append((index, a))
        a, b, index = b, a + b, index + 1

    # Compute greedy representation of n as a sum of Fibonacci numbers;
    # accumulate the indices of those numbers in indices.
    indices = []
    for index, fib in reversed(fibs):
        if n >= fib:
            n -= fib
            indices.append(index)
    indices = indices[::-1]

    # Compute the 'signature' of the number: the lengths of the pieces
    # of the form 00...01.
    signature = [i2 - i1 for i1, i2 in zip([-1] + indices[:-1], indices)]

    # Iterate to simultaneously compute total number of rewrites,
    # and the total number with the top bit set.
    total, top_set = 1, 1
    for l in signature:
        total, top_set = ((l + 2) // 2 * total - (l + 1) % 2 * top_set, total)

    # And return the grand total.
    return total

EDIT: Code substantially simplified.

EDIT 2: I just came across this answer again, and suspected that there was a more straightforward way. Here's yet another code simplification, showing clearly that O(log n) operations are required.

def combinations(n):
    """Number of ways to write n as a sum of positive
    Fibonacci numbers with distinct indices.
    """
    # Find Fibonacci numbers not exceeding n.
    fibs = []
    fib, next_fib = 0, 1
    while fib <= n:
        fibs.append(fib)
        fib, next_fib = next_fib, fib + next_fib

    # Compute greedy representation, most significant bit first.
    greedy = []
    for fib in reversed(fibs):
        greedy.append(fib <= n)
        if greedy[-1]:
            n -= fib

    # Iterate to compute number of rewrites.
    x, y, z = 1, 0, 0
    for bit in reversed(greedy):
        x, y, z = (0, y + z, x) if bit else (x + z, z, y)
    return y + z

Question 3

You can find the lexicographically largest 0/1 representation in Fibonacci-base by taking the largest Fibonacci number less than or equal your number, subtract that and then take the next largest Fibonacci number less than or equal to the remainder, etc. Then the question is how to find all other 0/1 representations in Fibonacci-base from the lexicographically largest one. I believe you can use the Fibonacci recurrence relation to do this. For example if your representation is 1100... then you can replace the 2nd largest Fib number in the representation with the sum of the next two, giving 1011.....If you recursively process the string this way from left to right repeatedly, either choosing to make the substitution or not whenever you can, and using dynamic programming to remember which representations you have already explored, I believe you will get all the representations, and in O(m log n) time where m is the total number of Fibonacci representations for your number n. I'll update if I find a definitive proof of that. Meanwhile you can check the conjecture for numbers up to about a million or so. If it checks out for all those cases, then it's almost certainly true in general.

Question 4

One of naive possibilities in python (works up to 10^6 in reasonable time)

def nfibhelper(fibm1,fibm2,n):
    fib = fibm1 + fibm2
    if fib > n:
        return 0
    r=0
    if n == fib :
        r+=1
    return r + nfibhelper(fibm2,fib,n-fib) + nfibhelper(fibm2,fib,n)


def F(n):
    return nfibhelper(1,0,n) ##1 will be used twice as fib