Question

My current task is to optimise a Monte Carlo simulation that calculates capital adequacy figures by region for a set of obligors.

It is running about 10x slower than it will need to be in production, given the number of daily runs required. Additionally, the granularity of the result figures will need to be improved, down to desk and possibly book level at some stage. The code I've been given is basically a prototype, which is used by business units in a semi-production capacity.

The application is currently single threaded, so I'll need to make it multi-threaded. I may look at System.Threading.ThreadPool or the Microsoft Parallel Extensions library, but I'm constrained to .NET 2 on the server at this bank, so I may have to consider this port: http://www.codeproject.com/KB/cs/aforge_parallel.aspx.

I am trying my best to get them to upgrade to .NET 3.5 SP1 but it's a major exercise in an organisation of this size and might not be possible in my contract time frames.

I've profiled the application using the trial of dotTrace (http://www.jetbrains.com/profiler). What other good profilers exist? Free ones?

A lot of the execution time is spent generating uniform random numbers and then translating them to normally distributed random numbers. They are using a C# Mersenne Twister implementation. I'm not sure where they got it, or whether it's the best way (or the best implementation) to generate the uniform random numbers. The uniforms are then translated to a normal distribution for use in the calculation (I haven't delved into the translation code yet).

Also what is the experience using the following?

Any alternatives you know of? I'm a C# developer so would prefer C#, but a wrapper to C++ shouldn't be a problem, should it?

It may even be faster to leverage the C++ implementations directly. I'm thinking some of these libraries will have the fastest methods to generate normally distributed random numbers directly, without the translation step. They may also have other functions that will be helpful in the subsequent calculations.

Also the computer this is on is a quad core Opteron 275, 8 GB memory but Windows Server 2003 Enterprise 32 bit. Should I advise them to upgrade to a 64 bit OS? Any links to articles supporting this decision would really be appreciated.

Anyway, any advice and help you may have is really appreciated.


Solution

I have found the Mersenne Twister to be quick. The problem may be in the algorithm (Box-Muller) used to transform the uniform distribution to a Gaussian distribution. The standard algorithm looks like:

y1 = sqrt( - 2 ln(x1) ) cos( 2 pi x2 )
y2 = sqrt( - 2 ln(x1) ) sin( 2 pi x2 )

Where x1 and x2 are uniform random numbers and y1 and y2 are the Gaussian-distributed outputs.

The square roots are slow, but the trig is worse, and it is unstable close to 0. Taygeta's page on the subject gives a faster one (in pseudocode):

    float x1, x2, w, y1, y2;

    do {
        x1 = 2.0 * ranf() - 1.0;
        x2 = 2.0 * ranf() - 1.0;
        w = x1 * x1 + x2 * x2;
    } while ( w >= 1.0 );

    w = sqrt( (-2.0 * ln( w )) / w );
    y1 = x1 * w;
    y2 = x2 * w;

If they're not using something like this, you may be able to speed things up quite a bit by avoiding the trig functions or even pre-generating the random numbers.
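To make the pseudocode above concrete, here is a minimal C++ sketch of the polar (Marsaglia) method that pre-fills a buffer of normal deviates, so the simulation loop only reads from memory. `ranf()` is backed by `std::mt19937` here purely as a stand-in for whatever uniform generator the application actually uses:

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

static std::mt19937 rng(12345);  // fixed seed for reproducible runs
static std::uniform_real_distribution<double> uni(0.0, 1.0);

static double ranf() { return uni(rng); }

// Fill `out` with standard-normal deviates, two per rejection-loop pass.
void fill_normals(std::vector<double>& out) {
    for (std::size_t i = 0; i < out.size(); i += 2) {
        double x1, x2, w;
        do {
            x1 = 2.0 * ranf() - 1.0;
            x2 = 2.0 * ranf() - 1.0;
            w  = x1 * x1 + x2 * x2;
        } while (w >= 1.0 || w == 0.0);  // reject points outside the unit circle
        w = std::sqrt(-2.0 * std::log(w) / w);
        out[i] = x1 * w;
        if (i + 1 < out.size()) out[i + 1] = x2 * w;
    }
}
```

Note the extra `w == 0.0` guard, which avoids `log(0)` in the rare case both uniforms land exactly on the centre.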

OTHER TIPS

Have you considered pointing a profiler at your code? I've seen cases where simple fixes give very significant improvements, like switching a couple of properties over to fields.

Being constrained to use .NET in the first place for a large-scale simulation is going to cost you quite a bit of performance right up front... but that said...

If you're running a pure C# implementation of the Mersenne Twister, it's likely that you'll have a hard time tweaking all the performance you can out of it. If you check out the Mersenne Twister reference implementation, you'll see they have a C version that is heavily optimized for SSE-capable processors - this is very fast. I don't believe it's possible in C# (or at least, I'm not aware how) to force the usage of SSE instructions with that level of optimization. I'd suggest writing a C++/CLI wrapper (or a P/Invoke wrapper) around the Mersenne Twister libraries and seeing how that affects your performance. However, you'll have to be careful with managed-unmanaged marshalling affecting your performance, as I have seen other posts here on SO about that issue (though I can't seem to find them right now...).
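As a hedged sketch of what such a wrapper might look like (this is not the SFMT reference code itself, and the exported names are invented for illustration): a tiny `extern "C"` shim that a .NET 2.0 application could call via P/Invoke. `std::mt19937` stands in for the optimized native twister. The key point is batching - fill a whole array per crossing so the managed/unmanaged transition cost is amortized over many draws:

```cpp
#include <random>

static std::mt19937 g_rng;

extern "C" {
    // Re-seed the native generator.
    void mt_seed(unsigned int seed) { g_rng.seed(seed); }

    // Fill `out` with `n` uniforms in [0, 1); one P/Invoke call per batch.
    void mt_fill_uniform(double* out, int n) {
        for (int i = 0; i < n; ++i)
            out[i] = g_rng() * (1.0 / 4294967296.0);  // divide by 2^32
    }
}
```

On the C# side this would pair with something like `[DllImport("mtshim.dll")] static extern void mt_fill_uniform(double[] buffer, int n);` (again, hypothetical names), passing a pre-allocated array that the marshaller can pin rather than copy.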

I may generate some flame for saying this, but if performance is a significant concern in your application, well-written C or C++ is almost always going to be preferable to any managed or interpreted language.

My experience is that the relative performance of C# vs. C++ is largely dependent on what you're doing. A great discussion of that here:

C++ performance vs. Java/C#

For tight loops doing math (say, vector physics calculations), C++ is 2-3 times faster than C#, although the performance may be dominated by underlying functions like Sqrt().

I've taken a mixed language approach, (re)implementing the slowest code in C++/OpenMP with a managed C++/CLI wrapper. This allows you to only "pay for what you use".

There's a summary of how to wrap native C/C++ with C++/CLI here:

http://msdn.microsoft.com/en-us/library/ms235281.aspx

Once you get the hang of C++/CLI it's pretty easy to get things running.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow