Optimizations with value types and method calls using ref

Question

Your Particle type will have a memory cost of at least 20 bytes.

If you don't use ref, your method will look like:

public static Particle UpdateParticle(Particle p)
{
    p.Position += p.Velocity;
    return p;
}

If you do this, you add one Particle copy to pass the Particle to the UpdateParticle() method, and another to return the result. Because each copy is a read plus a write, copying one 20-byte particle costs 40 bytes of memory access, and the two copies cost 80 bytes.
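For comparison, here is a minimal sketch of the ref version, which passes a reference to the caller's Particle instead of copying it. The field layout below (two Vector2 fields plus a float, 20 bytes in total) and the ParticleOps class name are assumptions for illustration, since the original struct is not shown.

```csharp
using System.Numerics;

// Hypothetical 20-byte layout; the real Particle definition is not shown.
public struct Particle
{
    public Vector2 Position; // 8 bytes
    public Vector2 Velocity; // 8 bytes
    public float Life;       // 4 bytes
}

public static class ParticleOps
{
    // 'ref' passes a reference to the caller's struct, so no copy is made
    // on entry and nothing needs to be returned.
    public static void UpdateParticle(ref Particle p)
    {
        p.Position += p.Velocity;
    }
}
```

With an array, a call such as ParticleOps.UpdateParticle(ref particles[i]) updates the element in place.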

Updating 100 000 particles without using ref therefore requires about 8 000 000 bytes of memory access for the copies alone.

On common Core i5/i7 processors, memory bandwidth is between 15 and 40 GB/s. Taking 20 GB/s as a typical figure, the memory access overhead of not using ref can be estimated at 8 MB / 20 000 MB/s = 0.4 ms. On my Core 2, with 6 GB/s of memory bandwidth, it is 8 / 6 000 = 1.3 ms.
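The estimate above can be written out as a small helper. This is only a sketch of the arithmetic; the class and method names are hypothetical, and it assumes 1 GB = 10^9 bytes and that every copy counts as one read plus one write.

```csharp
public static class CopyOverheadEstimate
{
    // Bytes accessed when each particle is copied in and copied back out.
    public static long BytesCopied(long particleCount, int particleSize)
    {
        const int copiesPerCall = 2;   // argument copy + return copy
        const int accessesPerCopy = 2; // one read + one write
        return particleCount * particleSize * copiesPerCall * accessesPerCopy;
    }

    // Time in milliseconds to move 'bytes' at the given bandwidth.
    public static double Milliseconds(long bytes, double gigabytesPerSecond)
    {
        return bytes / (gigabytesPerSecond * 1e9) * 1000.0;
    }
}
```

For 100 000 particles of 20 bytes, BytesCopied gives 8 000 000 bytes, and Milliseconds gives about 0.4 ms at 20 GB/s and about 1.3 ms at 6 GB/s.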

To measure the difference reliably, you need at least 10 000 000 particles (100 times more, so an estimated 130 ms of copy overhead on my computer).

On my computer, measurements show about 600 ms for the copy version and about 200 ms for the ref version.
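A comparison of this kind could be timed roughly as follows. This is not the exact benchmark used for the figures above, only a sketch under assumptions: the Particle layout and the ParticleBenchmark class are hypothetical, and the results will vary with hardware, runtime version, and build configuration (measure a release build).

```csharp
using System.Diagnostics;
using System.Numerics;

// Hypothetical 20-byte layout; the real Particle definition is not shown.
public struct Particle
{
    public Vector2 Position;
    public Vector2 Velocity;
    public float Life;
}

public static class ParticleBenchmark
{
    // Copy version: the struct is copied in and copied back out.
    static Particle UpdateCopy(Particle p)
    {
        p.Position += p.Velocity;
        return p;
    }

    // Ref version: the struct is updated in place.
    static void UpdateRef(ref Particle p)
    {
        p.Position += p.Velocity;
    }

    // Returns the elapsed milliseconds for the copy and ref versions.
    public static (long CopyMs, long RefMs) Run(int count)
    {
        var particles = new Particle[count];
        for (int i = 0; i < particles.Length; i++)
            particles[i].Velocity = new Vector2(1f, 1f);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < particles.Length; i++)
            particles[i] = UpdateCopy(particles[i]);
        long copyMs = sw.ElapsedMilliseconds;

        sw.Restart();
        for (int i = 0; i < particles.Length; i++)
            UpdateRef(ref particles[i]);
        long refMs = sw.ElapsedMilliseconds;

        return (copyMs, refMs);
    }
}
```

Calling ParticleBenchmark.Run(10000000) corresponds to the particle count used above; running it several times and discarding the first run reduces the effect of JIT warm-up.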

I also added an Update() member method to the Particle struct; updating 10 000 000 particles with this method also takes about 200 ms.
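The member method might look like the sketch below (the field layout is again an assumption). An instance method on a struct receives this by reference, which is consistent with it costing about the same as the ref version.

```csharp
using System.Numerics;

// Hypothetical 20-byte layout; the real Particle definition is not shown.
public struct Particle
{
    public Vector2 Position;
    public Vector2 Velocity;
    public float Life;

    // 'this' is passed by reference for struct instance methods,
    // so the update happens in place without copying the struct.
    public void Update()
    {
        Position += Velocity;
    }
}
```

On an array, particles[i].Update() modifies the element itself. On a List<Particle>, the indexer returns a copy, so the same call would update a temporary instead.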

Finally, I tried doing the addition directly in the loop, without any method call. It took only about 150 ms (a 50 ms gain).
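The inlined loop could be sketched like this, with the same assumed Particle layout; InlineUpdate is a hypothetical name used only to make the fragment self-contained.

```csharp
using System.Numerics;

// Hypothetical 20-byte layout; the real Particle definition is not shown.
public struct Particle
{
    public Vector2 Position;
    public Vector2 Velocity;
    public float Life;
}

public static class InlineUpdate
{
    // The addition is written directly in the loop body, so there is
    // no per-particle UpdateParticle call and no copy of the struct.
    public static void UpdateAll(Particle[] particles)
    {
        for (int i = 0; i < particles.Length; i++)
        {
            particles[i].Position += particles[i].Velocity;
        }
    }
}
```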

The total difference of 400 ms between the copy version and the ref version (a little less than half a second) is about 3 times the estimated memory bandwidth overhead alone (130 ms).

This additional overhead is due to the JIT compiler.

In conclusion, although the use of ref is generally discouraged in the .NET Framework, it is well worth using when you work with relatively large structs.

In C++ you can get far better performance. We can expect smarter C# compilers to do better in the future, especially with automatic method inlining, which can remove both the memory copy and the method call overhead.

In C#, the strict separation of reference types (classes) and value types (structs) is problematic when you deal with a huge number of instances: 1) with classes, you pay for a separate heap allocation per instance; 2) with structs, you pay for the automatic copies.

When designing your program, you have to balance good program design (which implies many method calls and object copies) against the performance you expect. To be able to optimize and change the design of your particle model later, it could be a good idea to create a Particles class that manages the particle collections in your system and provides optimized operations on large sets of particles.
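A minimal sketch of such a Particles class is shown below. Everything in it is hypothetical (the Particle layout, the fixed-capacity array, and the Add, PositionAt, and UpdateAll members); the point is only that callers use bulk operations, so the per-particle strategy can change without touching them.

```csharp
using System;
using System.Numerics;

// Hypothetical 20-byte layout; the real Particle definition is not shown.
public struct Particle
{
    public Vector2 Position;
    public Vector2 Velocity;
    public float Life;
}

// Hypothetical container: owns the storage and exposes bulk operations.
public sealed class Particles
{
    private readonly Particle[] _items;

    public int Count { get; private set; }

    public Particles(int capacity)
    {
        _items = new Particle[capacity];
    }

    // Writes the new particle directly into the array slot.
    public void Add(Vector2 position, Vector2 velocity)
    {
        if (Count == _items.Length)
            throw new InvalidOperationException("Particles is full.");
        _items[Count].Position = position;
        _items[Count].Velocity = velocity;
        Count++;
    }

    // Returns only the 8-byte position, not the whole struct.
    public Vector2 PositionAt(int index)
    {
        if (index < 0 || index >= Count)
            throw new ArgumentOutOfRangeException(nameof(index));
        return _items[index].Position;
    }

    // Bulk update performed in place, with no per-particle struct copy.
    public void UpdateAll()
    {
        for (int i = 0; i < Count; i++)
        {
            _items[i].Position += _items[i].Velocity;
        }
    }
}
```

Because the array is private, the loop in UpdateAll can later be replaced by another implementation without changing the calling code.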