Why does calling ToString on an enum use box/callvirt instead of pushing the address and using call? Are there any other special cases?

StackOverflow https://stackoverflow.com/questions/22711443

  •  23-06-2023

Question

I have this framework I wrote a couple of months ago that generates a class for calling a performance service. Consumers of the framework create an interface with methods, annotate it with attributes, and call a factory method that creates an implementation of the interface, which they use to call the performance service. The service supports only two data types, string and long. I use Reflection.Emit with collectible assemblies to generate the class that implements the interface.

Everything has been working well, but today someone told me they were getting an access violation (AV) when they tried to pass in an enum that would be converted to a string. The code checks whether the type is a value type and, if so, pushes the address (ldarga or ldflda, depending on the interface the consumer created) and then calls ToString. So I created a little debug app, and I saw that the C# compiler boxes an enum and then calls ToString on the boxed enum.

So I am confused. Is the way I am handling value types incorrect? Is the IL the C# compiler generates for ToString on an enum the correct way to do it? Are there any other special cases like this?

Update with answer: it looks like I need to check whether the value type itself implements ToString and, if it doesn't, box. For value types I guess this applies to the methods inherited from object: ToString, GetHashCode, and Equals.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Reflection.Emit;
using System.Reflection;

namespace ConsoleApplication15
{
    public struct H
    {
    }
    class Program
    {
        static void Main(string[] args)
        {
            //Test<AttributeTargets>(AttributeTargets.ReturnValue); // fails
            //Test<int>(10); // works
            //TestBox<AttributeTargets>(AttributeTargets.ReturnValue); // works
            //Test<H>(new H()); // fails
            TestCorrect<H>(new H()); // works
            TestCorrect<int>(10); // works

            Console.ReadLine();
        }

        private static void TestCorrect<T>(T t)
            where T : struct
        {
            MethodInfo method = typeof(T).GetMethod(
                "ToString",
                BindingFlags.Public | BindingFlags.Instance,
                null,
                Type.EmptyTypes,
                null);

            var m = new DynamicMethod("x", typeof(string), new[] { typeof(T) });
            var i = m.GetILGenerator();
            if (method.DeclaringType == typeof(T))
            {
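                // Short form with a byte operand; ldarga's operand is 16-bit, so an int operand writes stray bytes.
                i.Emit(OpCodes.Ldarga_S, (byte)0);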
                i.Emit(OpCodes.Call, method);
            }
            else
            {
                i.Emit(OpCodes.Ldarg_0);
                i.Emit(OpCodes.Box, typeof(T));
                i.Emit(OpCodes.Callvirt, method);
            }

            i.Emit(OpCodes.Ret);
            string result = (m.CreateDelegate(typeof(Func<T, string>)) as Func<T, string>)(t);

            Console.WriteLine(result);
        }

        private static void Test<T>(T t)
            where T : struct
        {
            MethodInfo method = typeof(T).GetMethod(
                "ToString",
                BindingFlags.Public | BindingFlags.Instance,
                null,
                Type.EmptyTypes,
                null);

            var m = new DynamicMethod("x", typeof(string), new[] { typeof(T) });
            var i = m.GetILGenerator();
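            // Short form with a byte operand; ldarga's operand is 16-bit, so an int operand writes stray bytes.
            i.Emit(OpCodes.Ldarga_S, (byte)0);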
            i.Emit(OpCodes.Call, method);
            i.Emit(OpCodes.Ret);
            string result = (m.CreateDelegate(typeof(Func<T, string>)) as Func<T, string>)(t);

            Console.WriteLine(result);
        }

        private static void TestBox<T>(T t)
            where T : struct
        {
            // This is how the C# compiler calls ToString on an enum.
            MethodInfo method = typeof(T).GetMethod(
                "ToString",
                BindingFlags.Public | BindingFlags.Instance,
                null,
                Type.EmptyTypes,
                null);

            var m = new DynamicMethod("x", typeof(string), new[] { typeof(T) });
            var i = m.GetILGenerator();
            i.Emit(OpCodes.Ldarg_0);
            i.Emit(OpCodes.Box, typeof(T));
            i.Emit(OpCodes.Callvirt, method);
            i.Emit(OpCodes.Ret);
            string result = (m.CreateDelegate(typeof(Func<T, string>)) as Func<T, string>)(t);

            Console.WriteLine(result);
        }
    }
}

Solution

An enumeration type doesn't override its ToString() method, so for any enumeration type e, e.ToString() resolves to Enum.ToString. This method is defined on a reference type (Enum is a reference type), so to call this method, the implicit this argument needs to be a boxed value.

Most other value types, such as int, do provide an overridden ToString method directly on the value type itself.
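
The difference can be checked with reflection. The following is a minimal sketch, not part of the original answer; the names ToStringProbe, DeclaringTypeOfToString and Empty are made up for illustration. It reports which type declares the parameterless ToString that a call on a given type resolves to.

```csharp
using System;
using System.Reflection;

// Hypothetical struct that does not override ToString.
public struct Empty { }

public static class ToStringProbe
{
    // Returns the type that declares the parameterless ToString
    // found when looking the method up on 'type'.
    public static Type DeclaringTypeOfToString(Type type)
    {
        MethodInfo method = type.GetMethod(
            "ToString",
            BindingFlags.Public | BindingFlags.Instance,
            null,
            Type.EmptyTypes,
            null);
        return method.DeclaringType;
    }
}
```

With this helper, typeof(int) reports System.Int32, typeof(AttributeTargets) reports System.Enum, and typeof(Empty) reports System.ValueType. Only the first case has a method declared on the value type itself, so only that one can be invoked with call on an address.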

From the specification:

I.12.1.6.2.4 Calling methods

Static methods on value types are handled no differently from static methods on an ordinary class: use a call instruction with a metadata token specifying the value type as the class of the method. Non-static methods (i.e., instance and virtual methods) are supported on value types, but they are given special treatment. A non-static method on a reference type (rather than a value type) expects a this pointer that is an instance of that class. This makes sense for reference types, since they have identity and the this pointer represents that identity. Value types, however, have identity only when boxed. To address this issue, the this pointer on a non-static method of a value type is a byref parameter of the value type rather than an ordinary by-value parameter.

A non-static method on a value type can be called in the following ways:

  • For unboxed instances of a value type, the exact type is known statically. The call instruction can be used to invoke the function, passing as the first parameter (the this pointer) the address of the instance. The metadata token used with the call instruction shall specify the value type itself as the class of the method.

  • Given a boxed instance of a value type, there are three cases to consider:

    • Instance or virtual methods introduced on the value type itself: unbox the instance and call the method directly using the value type as the class of the method.

    • Virtual methods inherited from a base class: use the callvirt instruction and specify the method on the System.Object, System.ValueType or System.Enum class as appropriate.

    • Virtual methods on interfaces implemented by the value type: use the callvirt instruction and specify the method on the interface type.
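
The rules quoted above apply equally to the other methods inherited from object. As a rough sketch under that reading (the class and method names here are hypothetical, not taken from the question's framework), a small helper can pick between the two call shapes for GetHashCode on an unboxed argument:

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ValueTypeCallEmitter
{
    // Emits a call to a parameterless instance method on the value in argument 0.
    public static void EmitCallOnArg0(ILGenerator il, Type valueType, MethodInfo method)
    {
        if (method.DeclaringType == valueType)
        {
            // Declared on the value type itself: pass the address as 'this'.
            il.Emit(OpCodes.Ldarga_S, (byte)0);
            il.Emit(OpCodes.Call, method);
        }
        else
        {
            // Inherited from Object, ValueType or Enum: box, then callvirt.
            il.Emit(OpCodes.Ldarg_0);
            il.Emit(OpCodes.Box, valueType);
            il.Emit(OpCodes.Callvirt, method);
        }
    }

    // Builds a delegate that calls GetHashCode on a value of type T.
    public static Func<T, int> BuildGetHashCode<T>() where T : struct
    {
        MethodInfo getHashCode = typeof(T).GetMethod("GetHashCode", Type.EmptyTypes);
        var dm = new DynamicMethod("HashOf", typeof(int), new[] { typeof(T) });
        ILGenerator il = dm.GetILGenerator();
        EmitCallOnArg0(il, typeof(T), getHashCode);
        il.Emit(OpCodes.Ret);
        return (Func<T, int>)dm.CreateDelegate(typeof(Func<T, int>));
    }
}
```

For int the helper takes the first branch, because Int32 declares its own GetHashCode. For an enum such as DayOfWeek it takes the second, because the method found is the one declared on System.Enum.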

OTHER TIPS

Consider:

using System;

class Program
{
    static void Main()
    {
        TestFoo(Foo.A);
        TestEnum(Foo.B);
        TestGenerics(Foo.C);
    }
    static string TestFoo(Foo foo)
    {
        return foo.ToString();
    }
    static string TestEnum(Enum foo)
    {
        return foo.ToString();
    }
    static string TestGenerics<T>(T foo)
    {
        return foo.ToString();
    }
}

enum Foo
{
    A, B, C
}

This generates the IL:

.class private auto ansi sealed Foo
    extends [mscorlib]System.Enum
{
    .field public static literal valuetype Foo A = int32(0)

    .field public static literal valuetype Foo B = int32(1)

    .field public static literal valuetype Foo C = int32(2)

    .field public specialname rtspecialname int32 value__

}

Note that Foo does not override ToString(). The compiler potentially could emit an override, but it doesn't. This is the main reason for seeing a box: if a struct doesn't override an object method, you can't call that method without boxing the value. Well, not quite that simple: the compiler can also use a constrained call, which defers the final decision to the JIT. If the type overrides the method, the JIT calls it directly; otherwise it boxes. This is exactly what the compiler emits in the two cases where the argument isn't already a reference type (note: anything passed as Enum is already boxed; Enum is a reference type):

.method private hidebysig static string TestFoo(valuetype Foo foo) cil managed
{
    .maxstack 8
    L_0000: ldarga.s foo
    L_0002: constrained. Foo
    L_0008: callvirt instance string [mscorlib]System.Object::ToString()
    L_000d: ret 
}

// NOTE: in this example, foo is **already** boxed before it comes in, hence
// no attempt at constrained-call
.method private hidebysig static string TestEnum(class [mscorlib]System.Enum foo) cil managed
{
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: callvirt instance string [mscorlib]System.Object::ToString()
    L_0006: ret 
}
.method private hidebysig static string TestGenerics<T>(!!T foo) cil managed
{
    .maxstack 8
    L_0000: ldarga.s foo
    L_0002: constrained. !!T
    L_0008: callvirt instance string [mscorlib]System.Object::ToString()
    L_000d: ret 
}

Now, it could be that the JIT and CLI work some voodoo there that lets the constrained call avoid the box even though the type has no override, but if they don't, this at least explains (hopefully) why it is boxing, and demonstrates that the compiler tried quite hard to avoid the box (via the constrained call).
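
The same constrained call can be emitted from Reflection.Emit, which avoids having to check where the method is declared. This is a minimal sketch that mirrors the IL above; the class name ConstrainedToString and the method name Build are invented for the example and do not come from the question's framework.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ConstrainedToString
{
    // Builds a delegate that calls ToString on a value of type T using
    // ldarga.s / constrained. T / callvirt Object::ToString.
    public static Func<T, string> Build<T>() where T : struct
    {
        MethodInfo objectToString = typeof(object).GetMethod("ToString", Type.EmptyTypes);

        var dm = new DynamicMethod("ToStringConstrained", typeof(string), new[] { typeof(T) });
        ILGenerator il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldarga_S, (byte)0);        // address of the argument
        il.Emit(OpCodes.Constrained, typeof(T));   // prefix: runtime decides whether a box is needed
        il.Emit(OpCodes.Callvirt, objectToString); // target is the Object method, not T's
        il.Emit(OpCodes.Ret);

        return (Func<T, string>)dm.CreateDelegate(typeof(Func<T, string>));
    }
}
```

Here ConstrainedToString.Build<int>()(10) yields "10" and ConstrainedToString.Build<AttributeTargets>()(AttributeTargets.ReturnValue) yields "ReturnValue", with the same emitted IL in both cases.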

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow