boost::uniform_on_sphere suddenly fails after a few million correct realizations, but only on certain hosts

https://stackoverflow.com/questions/13763946

05-12-2021

Question

The Problem

After correctly generating random vectors in 2 dimensions for a while, the boost::uniform_on_sphere distribution suddenly generates a vector whose components are both -nan. I have tested the included program on three machines: the error was observed on two of them, but not on the third. Does anyone have an idea of what could be going on?

Edit: the failure occurs on all hosts if the same floating-point type is used.

The Hosts

Host 1

AMD Opteron(tm) Processor 6174
g++ (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
fails after 3802480 realizations

Host 2

Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz
g++ (GCC) 4.7.2 20120921 (Red Hat 4.7.2-2)
fails after 3802480 realizations

Host 3

Intel(R) Atom(TM) CPU D2700 @ 2.13GHz
g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
fails after 3802480 realizations, with slightly different output

Realizations

On Hosts 1 and 2:

3802470: -4.8961880803e-01 , -8.7193667889e-01
3802471: 9.9225074053e-01 , -1.2425158173e-01
3802472: 6.5411877632e-01 , -7.5639182329e-01
3802473: -9.8332953453e-01 , -1.8183253706e-01
3802474: 7.1217632294e-01 , -7.0200067759e-01
3802475: -9.9968332052e-01 , 2.5166392326e-02
3802476: 9.9412262440e-01 , 1.0826008022e-01
3802477: -6.2786966562e-01 , 7.7831840515e-01
3802478: 5.7143938541e-01 , 8.2064425945e-01
3802479: 5.8261138201e-01 , 8.1275087595e-01
3802480: -nan , -nan
3802481: 9.2151606083e-01 , 3.8834002614e-01
3802482: 8.6448800564e-01 , -5.0265353918e-01
3802483: -9.1891586781e-01 , 3.9445358515e-01
3802484: -9.1544634104e-01 , 4.0244001150e-01

On Host 3, using float instead of float_t:

3802470: -4.8961877823e-01 , -8.7193661928e-01
3802471: 9.9225074053e-01 , -1.2425158918e-01
3802472: 6.5411871672e-01 , -7.5639182329e-01
3802473: -9.8332953453e-01 , -1.8183253706e-01  <- exactly the same as above
3802474: 7.1217626333e-01 , -7.0200061798e-01
3802475: -9.9968332052e-01 , 2.5166388601e-02
3802476: 9.9412262440e-01 , 1.0826008022e-01
3802477: -6.2786966562e-01 , 7.7831846476e-01
3802478: 5.7143932581e-01 , 8.2064431906e-01   <- slightly different
3802479: 5.8261138201e-01 , 8.1275087595e-01   <- exactly the same
3802480: -nan , -nan
3802481: 9.2151612043e-01 , 3.8834002614e-01
3802482: 8.6448800564e-01 , -5.0265347958e-01
3802483: -9.1891586781e-01 , 3.9445355535e-01
3802484: -9.1544634104e-01 , 4.0244001150e-01

The Program

This was compiled simply with g++ bug.cpp. Turning on -O3 optimizations did not change the result.

#include <boost/circular_buffer.hpp>
#include <boost/random/variate_generator.hpp>
#include <boost/random/uniform_on_sphere.hpp>
#include <boost/random.hpp>
#include <cmath>     // isnan, float_t
#include <cstdlib>   // exit
#include <iostream>
#include <fstream>
using namespace std;

int main(int argc, const char *argv[])
{
    typedef boost::mt19937                                              GeneratorType;
    typedef boost::uniform_on_sphere<float_t>                           DistributionType;
    typedef boost::variate_generator<GeneratorType, DistributionType >  VariateType;
    typedef boost::circular_buffer<DistributionType::result_type>       BufferType;
    GeneratorType       gen;
    DistributionType    dist(2);
    VariateType         variate(gen,dist);
    const int           BUFSIZE = 10;

    gen.seed(11);
    BufferType buf(BUFSIZE);
    long n(0);
    while (1){
        cout << "n: " << n << "\r" << flush;
        DistributionType::result_type tmp = variate();
        buf.push_back(tmp);
        if (isnan(tmp[0])) {
            cout << "n: " << n << "       " << endl;
            cout << tmp[0] << " , " << tmp[1] << endl;
            ofstream fout("debug.out");
            for (int i=0; i<BUFSIZE; i++)
                fout << buf[i][0] << " " << buf[i][1] << endl;
            fout.close();
            ofstream gen_out("gen.out");
            gen_out << gen;
            gen_out.close();
            exit(1);
        }
        n++;
    }

    return 0;
}

I would really appreciate any help!

Solution

This seems to occur due to a bug in boost/random/uniform_on_sphere.hpp. In N dimensions, N normally distributed numbers are generated as the components of an N-vector, which is then normalized. In 2D, the probability of drawing two zeros is apparently not negligible, and the normalization then computes 0/0 = NaN for each component.
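
The failure mode described above (normalizing an all-zero vector) can be guarded against by checking the squared norm before dividing and redrawing when it is zero. The following is only a sketch of that idea, not Boost's implementation: the names normalize_checked and guarded_on_sphere are hypothetical, and it assumes C++11 <random> instead of Boost.Random so that it compiles on its own.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Normalize v in place. Returns false and leaves v untouched when the
// squared norm is not positive, which is the case that would give 0/0.
bool normalize_checked(std::vector<float>& v)
{
    float sqsum = 0.0f;
    for (float x : v) sqsum += x * x;
    if (!(sqsum > 0.0f)) return false;   // zero norm (or NaN input)
    const float inv = 1.0f / std::sqrt(sqsum);
    for (float& x : v) x *= inv;
    return true;
}

// Hypothetical guarded sampler: draw dim normal variates and redraw
// the whole vector whenever it cannot be normalized.
template <class Engine>
std::vector<float> guarded_on_sphere(Engine& eng, std::size_t dim)
{
    assert(dim > 0);
    std::normal_distribution<float> normal(0.0f, 1.0f);
    std::vector<float> v(dim);
    do {
        for (float& x : v) x = normal(eng);
    } while (!normalize_checked(v));
    return v;
}
```

Calling guarded_on_sphere(eng, 2) with a std::mt19937 named eng would return a two-component unit vector. The redraw loop costs almost nothing, since a zero norm is rare, but it does consume extra random numbers when it triggers, so the stream diverges from the unguarded version after that point.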

A workaround would be to just program this distribution by hand for small dimensions.
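
For the 2D case, one way to write the distribution by hand is to draw a single uniform angle and take its cosine and sine, which never involves a division. This is a minimal sketch under the assumption that C++11 <random> is acceptable in place of Boost.Random; the name uniform_on_circle is made up for illustration.

```cpp
#include <array>
#include <cmath>
#include <random>

// Hypothetical helper: a point on the unit circle obtained from one
// uniformly distributed angle in [0, 2*pi).
template <class Engine>
std::array<double, 2> uniform_on_circle(Engine& eng)
{
    const double two_pi = 2.0 * std::acos(-1.0);
    std::uniform_real_distribution<double> angle(0.0, two_pi);
    const double t = angle(eng);
    return {{std::cos(t), std::sin(t)}};
}
```

With a std::mt19937 named eng, uniform_on_circle(eng) yields a pair (x, y) with x*x + y*y equal to 1 up to rounding. The same trick does not carry over directly to higher dimensions, where normalizing normal variates with a zero-norm check is the more usual approach.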

OTHER TIPS

One big difference between x86-32 and x86-64 is the floating-point hardware used by default: x87 (80-bit extended precision) on x86-32 versus SSE (64-bit double precision) on x86-64. In other words, the 32-bit Atom has 16 extra bits to work in.
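
This may also explain why the result depended on the type: float_t is defined in terms of FLT_EVAL_METHOD, so it is float when that macro is 0 (typical for SSE builds) and long double when it is 2 (typical for x87 builds). The sketch below reports both for the current build. The function names are hypothetical, and it assumes a C++11 compiler that provides std::float_t and FLT_EVAL_METHOD.

```cpp
#include <cfloat>
#include <cmath>
#include <string>
#include <type_traits>

// Name of the type that float_t aliases in this build.
std::string float_t_alias()
{
    if (std::is_same<std::float_t, float>::value)       return "float";
    if (std::is_same<std::float_t, double>::value)      return "double";
    if (std::is_same<std::float_t, long double>::value) return "long double";
    return "other";
}

// Floating-point evaluation mode reported by the compiler:
// 0 = evaluate in the operand type, 1 = double, 2 = long double.
int eval_method()
{
    return FLT_EVAL_METHOD;
}
```

Printing float_t_alias() and eval_method() on each host would show whether the three machines were really instantiating boost::uniform_on_sphere with the same type.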

Licensed under: CC-BY-SA with attribution
