Question

Edit: Comments at bottom. Also, this.


Here's what's kind of confusing me. My understanding is that if I have an enum like this...

enum Animal
{
    Dog,
    Cat
}

...what I've essentially done is defined a value type called Animal with two defined values, Dog and Cat. This type derives from the reference type System.Enum (something which value types can't normally do—at least not in C#—but which is permitted in this case), and has a facility for casting back and forth to/from int values.

If the way I just described the enum type above were true, then I would expect the following code to throw an InvalidCastException:

public class Program
{
    public static void Main(string[] args)
    {
        // Box it.
        object animal = Animal.Dog;

        // Unbox it. How are these both successful?
        int i = (int)animal;
        Enum e = (Enum)animal;

        // Prints "0".
        Console.WriteLine(i);

        // Prints "Dog".
        Console.WriteLine(e);
    }
}

Normally, you cannot unbox a value type from System.Object as anything other than its exact type. So how is the above possible? It is as if the Animal type is an int (not just convertible to int) and is an Enum (not just convertible to Enum) at the same time. Is it multiple inheritance? Does System.Enum somehow inherit from System.Int32 (something I would not have expected to be possible)?

Edit: It can't be either of the above. The following code demonstrates this (I think) conclusively:

object animal = Animal.Dog;

Console.WriteLine(animal is Enum);
Console.WriteLine(animal is int);

The above outputs:

True
False

Both the MSDN documentation on enumerations and the C# specification make use of the term "underlying type"; but I don't know what this means, nor have I ever heard it used in reference to anything other than enums. What does "underlying type" actually mean?


So, is this yet another case that gets special treatment from the CLR?

My money's on that being the case... but an answer/explanation would be nice.


Update: Damien_The_Unbeliever provided the reference to truly answer this question. The explanation can be found in Partition II of the CLI specification, in the section on enums:

For binding purposes (e.g., for locating a method definition from the method reference used to call it) enums shall be distinct from their underlying type. For all other purposes, including verification and execution of code, an unboxed enum freely interconverts with its underlying type. Enums can be boxed to a corresponding boxed instance type, but this type is not the same as the boxed type of the underlying type, so boxing does not lose the original type of the enum.

Edit (again?!): Wait, actually, I don't know that I read that right the first time. Maybe it doesn't 100% explain the specialized unboxing behavior itself (though I'm leaving Damien's answer as accepted, as it shed a great deal of light on this issue). I will continue looking into this...


Another Edit: Man, then yodaj007's answer threw me for another loop. Somehow an enum is not exactly the same as an int; yet an int can be assigned to an enum variable with no cast? Buh?

I think this is all ultimately illuminated by Hans's answer, which is why I've accepted it. (Sorry, Damien!)


Solution

Yes, special treatment. The JIT compiler is keenly aware of the way boxed value types work, which in general is what makes value types act a bit schizoid. Boxing involves creating a System.Object value that behaves exactly the same way as a value of a reference type. At that point, value type values no longer behave like values do at runtime, which makes it possible, for example, to have a virtual method like ToString(). The boxed object has a method table pointer, just like reference types do.

The JIT compiler knows the method table pointers for value types like int and bool up front. Boxing and unboxing for them is very efficient; it takes but a handful of machine code instructions. This needed to be efficient back in .NET 1.0 to make it competitive. A very important part of that is the restriction that a value type value can only be unboxed to the same type. This saves the jitter from having to generate a massive switch statement that invokes the correct conversion code. All it has to do is check the method table pointer in the object, verify that it is the expected type, and copy the value out of the object directly. Notable perhaps is that this restriction doesn't exist in VB.NET; its CType() operator does in fact generate a call to a helper function that contains this big switch statement.

The problem with Enum types is that this cannot work. Enums can have different underlying types, as reported by Enum.GetUnderlyingType(). In other words, the unboxed values can have different sizes, so simply copying the value out of the boxed object cannot work. Keenly aware of this, the jitter no longer inlines the unboxing code; it generates a call to a helper function in the CLR.
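To make the size difference concrete, here is a minimal sketch. The enum names Tiny and Huge and the class UnderlyingTypeDemo are made up for illustration; Enum.GetUnderlyingType() and Marshal.SizeOf() are the only framework calls used.

```csharp
using System;
using System.Runtime.InteropServices;

enum Tiny : byte { A, B }   // hypothetical enum stored in a single byte
enum Huge : long { A, B }   // hypothetical enum stored in eight bytes

static class UnderlyingTypeDemo
{
    // Returns the size in bytes of the underlying type of the given enum type.
    public static int UnderlyingSize(Type enumType)
    {
        return Marshal.SizeOf(Enum.GetUnderlyingType(enumType));
    }

    static void Main()
    {
        // Prints the underlying type and its size for each enum.
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Tiny)) + ": " + UnderlyingSize(typeof(Tiny)));
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Huge)) + ": " + UnderlyingSize(typeof(Huge)));
    }
}
```

A boxed Tiny carries one byte of payload and a boxed Huge carries eight, so a single fixed-size copy cannot serve every enum.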

That helper is named JIT_Unbox(); you can find its source code in the SSCLI20 source, clr/src/vm/jithelpers.cpp. You'll see it dealing with enum types specially. It is permissive: it allows unboxing from one enum type to another, but only if the underlying type is the same. You get an InvalidCastException if that's not the case.

Which is also the reason that Enum is declared as a class. Its logical behavior is that of a reference type: derived enum types can be cast from one to another, subject to the restriction on underlying type compatibility noted above. The values of an enum type, however, very much have the behavior of a value type value. They have copy semantics and boxing behavior.
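The permissive unboxing described above can be tried out with a short sketch. Color and Small are hypothetical enums added for illustration, and the behavior shown is what I'd expect on Microsoft's CLR implementations, not something the C# language itself promises.

```csharp
using System;

enum Animal { Dog, Cat }     // underlying type int (the default)
enum Color { Red, Green }    // hypothetical enum, also int
enum Small : byte { A, B }   // hypothetical enum, underlying type byte

static class UnboxDemo
{
    // Tries to unbox a boxed value as Small and reports whether the runtime allowed it.
    public static bool CanUnboxAsSmall(object boxed)
    {
        try
        {
            Small s = (Small)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    static void Main()
    {
        object boxed = Animal.Cat;

        Color c = (Color)boxed;   // allowed: both enums are stored as int
        int i = (int)boxed;       // allowed: int is the underlying type

        Console.WriteLine(c);                        // Green
        Console.WriteLine(i);                        // 1
        Console.WriteLine(CanUnboxAsSmall(boxed));   // False: byte is not int
    }
}
```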

OTHER TIPS

Enums are specially dealt with by the CLR. If you want to go into the gory details, you can download the MS Partition II spec. In it, you'll find the following about enums:

Enums obey additional restrictions beyond those on other value types. Enums shall contain only fields as members (they shall not even define type initializers or instance constructors); they shall not implement any interfaces; they shall have auto field layout (§10.1.2); they shall have exactly one instance field and it shall be of the underlying type of the enum; all other fields shall be static and literal (§16.1);

So that's how they can inherit from System.Enum, but have an "underlying" type - it's the single instance field they're allowed to have.
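That single instance field can be inspected with reflection. This is only a sketch: the class name EnumFieldsDemo is made up, and the field name value__ is what I'd expect the C# compiler to emit, not something the quoted passage requires.

```csharp
using System;
using System.Reflection;

enum Animal { Dog, Cat }

static class EnumFieldsDemo
{
    // Returns the one instance field that stores an enum's value.
    public static FieldInfo ValueField(Type enumType)
    {
        FieldInfo[] fields = enumType.GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        return fields[0];
    }

    static void Main()
    {
        FieldInfo value = ValueField(typeof(Animal));
        Console.WriteLine(value.Name + " : " + value.FieldType);

        // The named members are static literal fields of the enum type.
        foreach (FieldInfo f in typeof(Animal).GetFields(BindingFlags.Static | BindingFlags.Public))
        {
            Console.WriteLine(f.Name + " literal=" + f.IsLiteral + " raw=" + f.GetRawConstantValue());
        }
    }
}
```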

There is also a discussion on boxing behaviour, but it doesn't describe explicitly unboxing to the underlying type, that I can see.

Partition I, 8.5.2 states that enums are "an alternate name for an existing type" but "[f]or the purposes of matching signatures, an enum shall not be the same as the underlying type."

Partition II, 14.3 expounds: "For all other purposes, including verification and execution of code, an unboxed enum freely interconverts with its underlying type. Enums can be boxed to a corresponding boxed instance type, but this type is not the same as the boxed type of the underlying type, so boxing does not lose the original type of the enum."

Partition III, 4.32 explains the unboxing behavior: "The type of value type contained within obj must be assignment compatible with valuetype. [Note: This effects the behavior with enum types, see Partition II.14.3. end note]"

What I'm noting here is from page 38 of ECMA-335 (I suggest you download it just to have it):

The CTS supports an enum (also known as an enumeration type), an alternate name for an existing type. For the purposes of matching signatures, an enum shall not be the same as the underlying type. Instances of an enum, however, shall be assignable-to the underlying type, and vice versa. That is, no cast (see §8.3.3) or coercion (see §8.3.2) is required to convert from the enum to the underlying type, nor are they required from the underlying type to the enum. An enum is considerably more restricted than a true type, as follows:

The underlying type shall be a built-in integer type. Enums shall derive from System.Enum, hence they are value types. Like all value types, they shall be sealed (see §8.9.9).

enum Foo { Bar = 1 }
Foo x = Foo.Bar;

This statement will be false because of the second sentence:

x is int

They are the same (an alias), but their signature is not the same. Converting to and from an int isn't a cast.
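One way to see the signature distinction from C# is overload resolution. In this sketch the two Describe overloads and the class SignatureDemo are hypothetical names of mine:

```csharp
using System;

enum Foo { Bar = 1 }

static class SignatureDemo
{
    // Two overloads that differ only in enum versus underlying type.
    public static string Describe(int value) { return "int overload"; }
    public static string Describe(Foo value) { return "Foo overload"; }

    static void Main()
    {
        Foo x = Foo.Bar;

        Console.WriteLine(Describe(x));        // Foo overload
        Console.WriteLine(Describe((int)x));   // int overload
        Console.WriteLine((object)x is int);   // False
    }
}
```

If Foo and int had the same signature, the two overloads could not coexist in one class.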

From page 46:

underlying types – in the CTS enumerations are alternate names for existing types (§8.5.2), termed their underlying type. Except for signature matching (§8.5.2) enumerations are treated as their underlying type. This subset is the set of storage types with the enumerations removed.

Go back to my Foo enum earlier. This statement will work:

Foo x = (Foo)5;

If you inspect the generated IL code of my Main method in Reflector:

.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
.maxstack 1
.locals init (
    [0] valuetype ConsoleTesting.Foo x)
L_0000: nop 
L_0001: ldc.i4.5 
L_0002: stloc.0 
L_0003: call string [mscorlib]System.Console::ReadLine()
L_0008: pop 
L_0009: ret 
}

Note there's no cast. ldc is found on page 86. It loads a constant. i4 is found on page 151, indicating the type is a 32-bit integer. There isn't a cast!

Extracted from MSDN:

The default underlying type of the enumeration elements is int. By default, the first enumerator has the value 0, and the value of each successive enumerator is increased by 1.

So, the cast is possible, but you need to force it:

The underlying type specifies how much storage is allocated for each enumerator. However, an explicit cast is needed to convert from enum type to an integral type.

When you box your enum into object, the boxed animal object's type derives from System.Enum (the real type is known at runtime), and the value it holds is actually an int, so the cast is valid.

  • (animal is Enum) returns true: For this reason you can unbox animal into an Enum or even into an int using an explicit cast.
  • (animal is int) returns false: The is operator (in general type check) does not check the underlying type for Enums. Also, for this reason you need to do an explicit casting to convert Enum to int.

Although enum types derive from System.Enum, any conversion between them is not direct, but a boxing/unboxing one. From the C# 3.0 Specification:

An enumeration type is a distinct type with named constants. Every enumeration type has an underlying type, which must be byte, sbyte, short, ushort, int, uint, long or ulong. The set of values of the enumeration type is the same as the set of values of the underlying type. Values of the enumeration type are not restricted to the values of the named constants. Enumeration types are defined through enumeration declarations

So, while your Animal type is derived from System.Enum, it's actually an int. Btw, another strange thing is that System.Enum is derived from System.ValueType, yet it's still a reference type.
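The spec's point that values "are not restricted to the values of the named constants" can be sketched like this (UndefinedValueDemo and IsNamed are illustrative names, not part of any API):

```csharp
using System;

enum Animal { Dog, Cat }

static class UndefinedValueDemo
{
    // True if the value matches one of the named constants of Animal.
    public static bool IsNamed(Animal value)
    {
        return Enum.IsDefined(typeof(Animal), value);
    }

    static void Main()
    {
        Animal unknown = (Animal)42;   // legal even though no constant has this value

        Console.WriteLine(unknown);               // 42
        Console.WriteLine(IsNamed(unknown));      // False
        Console.WriteLine(IsNamed(Animal.Cat));   // True
        Console.WriteLine((int)Animal.Cat);       // 1
    }
}
```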

An enum's underlying type is the type used to store the value of the constants. In your example, even though you haven't explicitly defined the values, C# does this:

enum Animal : int
{
    Dog = 0,
    Cat = 1
}

Internally, Animal is made up of two constants with the integer values 0 and 1. That's why you can explicitly cast an integer to an Animal and an Animal to an integer. If you pass Animal.Dog to a parameter that accepts an Animal, what you are really doing is passing the 32-bit integer value of Animal.Dog (in this case, 0). If you give Animal a different underlying type, then the values are stored as that type.

Why not... it is perfectly valid, for example, for a structure to hold an int internally, and be convertible to int with an explicit cast operator... let's simulate an Enum:

interface IEnum { }

struct MyEnumS : IEnum
{
    private int inner;

    public static explicit operator int(MyEnumS val)
    {
        return val.inner;
    }

    public static explicit operator MyEnumS(int val)
    {
        MyEnumS result;
        result.inner = val;
        return result;
    }

    public static readonly MyEnumS EnumItem1 = (MyEnumS)0;
    public static readonly MyEnumS EnumItem2 = (MyEnumS)2;
    public static readonly MyEnumS EnumItem3 = (MyEnumS)10;

    public override string ToString()
    {
        return inner == 0 ? "EnumItem1" :
            inner == 2 ? "EnumItem2" :
            inner == 10 ? "EnumItem3" :
            inner.ToString();
    }
}

This struct can be used in much the same way an enum can... of course, if you reflect on the type and check its IsEnum property, it will return false.

Let's look at some usage comparison, with the equivalent enum:

enum MyEnum
{
    EnumItem1 = 0,
    EnumItem2 = 2,
    EnumItem3 = 10,
}

Comparing usages:

Enum version:

var val = MyEnum.EnumItem1;
val = (MyEnum)50;
val = 0;
object obj = val;
bool isE = obj is MyEnum;
Enum en = val;

Struct version:

var valS = MyEnumS.EnumItem1;
valS = (MyEnumS)50;
//valS = 0; // cannot simulate this
object objS = valS;
bool isS = objS is MyEnumS;
IEnum enS = valS;

Some operations cannot be simulated, but this all shows what I intended to say... Enums are special, yes... how special? Not that much! =)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow