Question

It seems to me that many, if not most, compilers treat enumerated types as int underneath. In C (with gcc, for instance), enumeration constants have type int. In C#/Visual C#, you can change the underlying data type with something like:

enum Days : byte {Sat=1, Sun, Mon, Tue, Wed, Thu, Fri};

but the default is int if you don't change the type.

For obvious reasons, ints are efficient. They fly through the ALU with ease and avoid useless string comparisons.

At first glance, interpreted languages (Ruby, Perl, JavaScript, and others, I'm sure) don't seem to have enumerated types built in. They can be emulated, though, e.g. in JavaScript.

My questions:

  1. Why do (seemingly) most compilers use int as the underlying type? Is the reason historical/performance-related?
  2. Are there languages that compile enumerated types to something other than int (or byte, long, numerical types...)? If so, why did they choose to do it differently?

Solution

The first part of your question will be difficult to answer because that decision was made long ago and one would need to ask the creators of languages such as C to know for sure.

Your second point is quite answerable.

  • Java created the enum type in response to the widespread use of int constants, which do not behave well (Effective Java, Second Edition, Bloch, Item 30/Page 147). For example, with integer constants it is possible to multiply RED by BLUE, even though the result is meaningless; with enum values the compiler rejects the expression.

    In addition, this decision was made because the enumeration type can be "smarter": it can hold values as well as encapsulate behavior inside each enum value. Rather than writing a switch over the enum values, which is brittle and a maintenance problem, invert the logic and put the body of each case inside the corresponding enum value. Then call a method on your enum variable and let dynamic dispatch replace the switch. The enum's values and behavior are now localized in the same class.

  • C++11 now includes the enum class construct, which was added because languages such as Java had implemented "intelligent" enums with good success (The C++ Programming Language, 4th Edition, Stroustrup, Section 8.4/Page 218).
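The RED-times-BLUE point can be sketched in C++, chosen here only to match the C++ snippets elsewhere on this page; the answer itself is about Java. The names RED, BLUE, LegacyColor, Color and can_multiply are made up for this illustration. The trait simply asks the compiler whether `a * b` is a valid expression for a given type.

```cpp
#include <type_traits>
#include <utility>

// Plain integer constants: nothing stops meaningless arithmetic on them.
constexpr int RED = 0;
constexpr int BLUE = 2;
constexpr int nonsense = RED * BLUE;  // compiles, but what does it mean?

// Old-style enum: values silently promote to int, so the same problem remains.
enum LegacyColor { LegacyRed, LegacyGreen, LegacyBlue };

// C++11 scoped enum: no implicit conversion to int.
enum class Color { Red, Green, Blue };

// Hypothetical helper trait: true when `T * T` is a well-formed expression.
template <typename T, typename = void>
struct can_multiply : std::false_type {};

template <typename T>
struct can_multiply<T, decltype(void(std::declval<T>() * std::declval<T>()))>
    : std::true_type {};

static_assert(can_multiply<int>::value, "ints multiply freely");
static_assert(can_multiply<LegacyColor>::value, "unscoped enums promote to int");
static_assert(!can_multiply<Color>::value, "scoped enums reject operator*");
```

Writing `Color::Red * Color::Blue` directly would be a compile error, which is the kind of type safety the bullet above describes. C++ scoped enums do not carry per-value methods the way Java enums do, so the "behavior inside each enum value" part has no direct counterpart in this sketch.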

OTHER TIPS

I'll try to shed some light based on languages I'm most familiar with:

D:

Enumerators in the D language are just another way of declaring compile-time constants. They are very flexible and allow enums of string type, which is unusual. The size of integer constants can also be controlled. The D documentation on enums gives several examples.

C++:

Old-style unscoped enums in C++ are not required to have the size of a C++ int object. They can be larger than that. Quoting this reference:

an unscoped enumeration type whose underlying type is not fixed ... in this case, the underlying type is either int or, if not all enumerator values can be represented as int, an implementation-defined larger integral type that can represent all enumerator values.

So if we compare the size of these two C++ enumerations:

enum Test1 {
    A1,
    B1
};
enum Test2 {
    A2,
    B2 = 18446744073709551615ULL /* UINT64_MAX */
};

The first one will typically be the size of a default integer (int), while the second needs a 64-bit unsigned integer type (for example unsigned long long) to fit the large constant.
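A minimal sketch for checking this on your own compiler, assuming one that follows the rule quoted above (GCC and Clang on common 64-bit targets do; the exact underlying types are implementation-defined, and not every compiler widens the second enum). The constants kTest1Bytes and kTest2Bits are names introduced only for this example.

```cpp
#include <climits>
#include <cstddef>

enum Test1 {
    A1,
    B1
};
enum Test2 {
    A2,
    B2 = 18446744073709551615ULL /* UINT64_MAX */
};

// Size of the small enum in bytes: no larger than int, because every
// enumerator value fits in an int.
constexpr std::size_t kTest1Bytes = sizeof(Test1);

// Width of the large enum in bits: its underlying type has to be able to
// represent UINT64_MAX, so at least 64 bits are needed.
constexpr std::size_t kTest2Bits = sizeof(Test2) * CHAR_BIT;

static_assert(kTest1Bytes <= sizeof(int), "small enum fits in an int");
static_assert(kTest2Bits >= 64, "large enum needs at least 64 bits");
```

If the second static_assert fails, the compiler is keeping int as the underlying type and truncating the constant, which is worth knowing before relying on such an enum.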

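The new enum class introduced in C++11 has scoped constants that do not convert implicitly to integers. C++11 also lets the user specify the underlying type, and therefore the size, of an enum, whether scoped or unscoped; a scoped enum defaults to int. In this respect they are very similar to a C# enumerator.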
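As a rough C++11 counterpart of the C# `enum Days : byte` declaration from the question, here is a sketch with a fixed underlying type. Day, Flag and to_int are illustrative names, not part of any library.

```cpp
#include <cstdint>
#include <type_traits>

// Scoped enum with an explicit one-byte underlying type.
enum class Day : std::uint8_t { Sat = 1, Sun, Mon, Tue, Wed, Thu, Fri };

// Since C++11 an unscoped enum can fix its underlying type as well.
enum Flag : std::uint16_t { FlagA, FlagB };

static_assert(sizeof(Day) == sizeof(std::uint8_t), "Day is stored in one byte");
static_assert(std::is_same<std::underlying_type<Day>::type, std::uint8_t>::value,
              "underlying type is exactly what was requested");
static_assert(sizeof(Flag) == sizeof(std::uint16_t), "Flag is stored in two bytes");

// Scoped enums need an explicit cast to reach the integer value;
// unscoped enums still convert implicitly.
static_assert(!std::is_convertible<Day, int>::value, "no implicit Day -> int");
static_assert(std::is_convertible<Flag, int>::value, "implicit Flag -> int");

// Hypothetical helper making the explicit conversion readable.
constexpr int to_int(Day d) { return static_cast<int>(d); }
```

For example, `to_int(Day::Mon)` yields 3, because numbering starts at `Sat = 1` and continues in declaration order.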

Why are integers the common case?

I'll speculate that integers are commonly the default choice for enumerator types in programming languages because of their simplicity. The machine can only really deal with numbers. Even a text string is just an array of numbers interpreted in a special way by the program.

Licensed under: CC-BY-SA with attribution