Question

I'm trying to find maximum the Collatz sequence between 1 and 1000000. I wrote the following code below. I guess it is correct but it is extremely slow. Can you give me a few hints to make it faster? Thanks.

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
bool myfn(int i, int j) { return i<j; }

int collatz(int x);

int main()
{
    vector <int> myvector;
    for(int i = 1; i < 1000000; i++)
    {
        myvector.push_back(collatz(i));
    }

    cout<<*max_element(myvector.begin(),myvector.end(),myfn);


    return 0;

}

int collatz(int x)
{
    int counter = 1;


    while(1)
    {
        if(x == 1)
            break;

        if(x % 2 == 0)
        {
            x = x / 2;
            counter++;
        }
        else
        {
            x = 3 * x + 1;
            counter++;
        }
    }

    return counter;
}
Was it helpful?

Solution 2

Here are some tips:

Initialize vector to your capacity.
The vector may be expanding while you are pushing elements. Worst case, one reallocation per push_back.

Treat vector as array
After you have initialized your vector to its capacity, you can use the operator [] to access the vector slot instead of calling push_back.

Optimize collatz
This is probably your bottleneck. Try placing into a separate file and cranking up the compiler optimizations on it.

There may be a more optimal algorithm.

OTHER TIPS

Actually, I just tested, and your code isn't just "extremely slow", but infinitely looping. No, you didn't just disprove the Collatz conjecture, but as far as I can see, you're suffering from an integer overflow.

More specifically, during the collatz(113383) the int x variable becomes negative due to an overflow:

551580299 1654740898 827370449 -1812855948 -906427974 -453213987

Shortly after that, it starts looping on the following sequence and thus consequently never exits the loop

-37 -110 -55 -164 -82 -41 -122 -61 -182 -91 -272 -136 -68 -34 -17 -50 -25 -74 -37 ...

After I changed the int x argument to a long long, the program quite quickly finishes.

The question then of course still is whether no overflows occur that don't cause infinite loops and go unnoticed, as you would then still end up with the wrong answer. However, after putting

if (x < 0) {
    cout << "ERROR" << endl;
}

at the end of the while(1) loop and seeing nothing being output extra, I think you can be confident that a long long is big enough for the Collatz sequences for the numbers 1 to 1000000 (8 bytes, on my computer, compared to 4 bytes for an int, if you'd be interested).

EDIT: As a side note: I don't see any reason to keep a vector of all results if you only need the maximum. Following code would eat less memory:

int maxCollatz = 0;
for(int i = 1; i < 1000000; i++) {
    if (i % 10000 == 0)
        cout << i << endl;
    maxCollatz = max(maxCollatz, collatz(i));
}

And another note: you're actually only looping from 1 to 999999, so if the Collatz sequence for 1000000 would be the longest one, you might be missing that one...

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top