Why is a value-typed variable slower than a reference type in my case?

https://stackoverflow.com/questions/18552024

26-06-2022

Question

Value-typed structs are stack-based variables that reside in the processor cache rather than in RAM. So, by avoiding the trip over the system bus from the processor to RAM, value-typed variables should be faster than reference-type variables. So instead of using stackalloc (Practical use of `stackalloc` keyword), I've created the following structure that represents a stack-based array:

[StructLayout(LayoutKind.Sequential)]
public struct ArrayOf16Bytes
{
    public Byte e0;
    public Byte e1;
    public Byte e2;
    public Byte e3;
    public Byte e4;
    public Byte e5;
    public Byte e6;
    public Byte e7;
    public Byte e8;
    public Byte e9;
    public Byte eA;
    public Byte eB;
    public Byte eC;
    public Byte eD;
    public Byte eE;
    public Byte eF;

    public byte this[Int32 index] {
        get {
            switch (index) {
            case 0x0:
                return e0;
            case 0x1:
                return e1;
            case 0x2:
                return e2;
            case 0x3:
                return e3;
            case 0x4:
                return e4;
            case 0x5:
                return e5;
            case 0x6:
                return e6;
            case 0x7:
                return e7;
            case 0x8:
                return e8;
            case 0x9:
                return e9;
            case 0xA:
                return eA;
            case 0xB:
                return eB;
            case 0xC:
                return eC;
            case 0xD:
                return eD;
            case 0xE:
                return eE;
            case 0xF:
                return eF;
            default:
                throw new IndexOutOfRangeException ();
            }
        }
        set {
            switch (index) {
            case 0x0:
                e0 = value;
                break;
            case 0x1:
                e1 = value;
                break;
            case 0x2:
                e2 = value;
                break;
            case 0x3:
                e3 = value;
                break;
            case 0x4:
                e4 = value;
                break;
            case 0x5:
                e5 = value;
                break;
            case 0x6:
                e6 = value;
                break;
            case 0x7:
                e7 = value;
                break;
            case 0x8:
                e8 = value;
                break;
            case 0x9:
                e9 = value;
                break;
            case 0xA:
                eA = value;
                break;
            case 0xB:
                eB = value;
                break;
            case 0xC:
                eC = value;
                break;
            case 0xD:
                eD = value;
                break;
            case 0xE:
                eE = value;
                break;
            case 0xF:
                eF = value;
                break;
            default:
                throw new IndexOutOfRangeException ();
            }
        }
    }
}

The switch should be compiled to a jump table, and since cmp and jump are one-cycle instructions (Is there any significant difference between using if/else and switch-case in C#?), I expected this structure to be much faster than a regular array.

However, it works slower than an actual array in the following example:

[Test]
public void TestStackArrayPerformance() {
    var random = new Xor128 ();

    byte[] x = new byte[16];

    ArrayOf16Bytes p = new ArrayOf16Bytes ();

    for (int i = 0; i < 16; i++) {
        x [i] = p [i] = random.As<IUniform<byte>> ().Evaluate ();
    }

    var index = random.As<IUniform<Int32>> ().Evaluate (0, 15);

    var timer = DateTime.Now;
    for (int i = 0; i < 1000000000; i++) {
        var t = x [i & 0xF];
        x [i & 0xF] = x [index];
        x [index] = t;
    }
    Console.WriteLine ("Spinup took: {0}", DateTime.Now - timer);

    timer = DateTime.Now;
    for (int i = 0; i < 1000000000; i++) {
        var t = x [i & 0xF];
        x [i & 0xF] = x [index];
        x [index] = t;
    }
    Console.WriteLine ("Operation 1 took: {0}", DateTime.Now - timer);


    timer = DateTime.Now;
    for (int i = 0; i < 100000000; i++) {
        var t = p [i & 0xF];
        p [i & 0xF] = p [index];
        p [index] = t;
    }
    Console.WriteLine ("Operation 2 took: {0}", DateTime.Now - timer);

}

On my machine this piece of code shows the following results:

Spinup took: 00:00:00.3005500
Operation 1 took: 00:00:00.2959800
Operation 2 took: 00:00:04.4344340

Solution

I'm not an expert in this subject, but I believe you have some false assumptions here:

"Value-typed structs are stack-based variables that reside in the processor cache rather than in RAM. So, by avoiding the trip over the system bus from the processor to RAM, value-typed variables should be faster than reference-type variables."

Just because something is a reference type doesn't mean that the CPU cache won't be used. The stack isn't the only area of memory that can use the cache. In addition, the CPU is pretty smart at things like pre-fetching to cache, so you don't typically have to micro-manage your memory like this.

Also, keep in mind that when you access your struct within the loop, it's not just the instructions inside the getter and setter that cost time; there is also overhead every time you make a method call, and indexers are method calls. Unless the call is inlined, a method call involves passing the parameters (on the stack or in registers), jumping to the method's instructions, handing back the return value, jumping back, and so on. So it's no surprise that this is more costly than the simple instructions for setting an array element.
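To see the indexer overhead in isolation, something like the following sketch may help. It is not the asker's code: `ArrayOf4Bytes` is a hypothetical, shortened stand-in for `ArrayOf16Bytes` (four fields instead of sixteen), `IndexerOverheadDemo` and its method names are made up for illustration, and `Stopwatch` replaces `DateTime.Now` because it is intended for measuring elapsed time. Note that in the question the struct loop runs 100000000 iterations while the array loops run 1000000000, so the sketch uses the same count for both.

```csharp
using System;
using System.Diagnostics;

// Hypothetical, shortened stand-in for the question's ArrayOf16Bytes.
public struct ArrayOf4Bytes
{
    public byte e0, e1, e2, e3;

    // Every read or write through this indexer is a method call plus a switch.
    public byte this[int index]
    {
        get
        {
            switch (index)
            {
                case 0: return e0;
                case 1: return e1;
                case 2: return e2;
                case 3: return e3;
                default: throw new IndexOutOfRangeException();
            }
        }
        set
        {
            switch (index)
            {
                case 0: e0 = value; break;
                case 1: e1 = value; break;
                case 2: e2 = value; break;
                case 3: e3 = value; break;
                default: throw new IndexOutOfRangeException();
            }
        }
    }
}

public static class IndexerOverheadDemo
{
    // Same swap loop as in the question, on a heap-allocated array.
    public static TimeSpan TimeArray(byte[] x, int index, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            var t = x[i & 0x3];
            x[i & 0x3] = x[index];
            x[index] = t;
        }
        sw.Stop();
        return sw.Elapsed;
    }

    // Same swap loop, but going through the struct's indexer.
    public static TimeSpan TimeStruct(ref ArrayOf4Bytes p, int index, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            var t = p[i & 0x3];
            p[i & 0x3] = p[index];
            p[index] = t;
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 100000000; // identical for both measurements
        var x = new byte[] { 1, 2, 3, 4 };
        var p = new ArrayOf4Bytes { e0 = 1, e1 = 2, e2 = 3, e3 = 4 };

        // Warm-up runs so that JIT compilation is not part of the timing.
        TimeArray(x, 2, iterations);
        TimeStruct(ref p, 2, iterations);

        Console.WriteLine("Array:  {0}", TimeArray(x, 2, iterations));
        Console.WriteLine("Struct: {0}", TimeStruct(ref p, 2, iterations));
    }
}
```

The absolute numbers will depend on the runtime, the JIT version, and whether the build is optimized, so treat any result as a rough indication rather than a definitive measurement.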

License: CC-BY-SA with attribution

Not affiliated with StackOverflow