Boxing, what's your preference and which do you think is faster?

https://stackoverflow.com/questions/533379

22-08-2019

Question

In short, I think boxing is an annoyance. Let's look at some alternatives...

public class Box<T> 
    where T : struct
{
    public T Value { get; set; }

    public static implicit operator T(Box<T> box)
    {
        return box.Value;
    }
}

System.Int32 derives from the abstract class System.ValueType, which derives from the class System.Object. You cannot derive from System.ValueType in C#, but I would guess that the struct keyword does exactly that and that the CLI recognizes these kinds of type definitions as having pass-by-value semantics. Anyhow, when a struct is assigned to a variable of type object, boxing occurs. I don't wanna get caught up in boxing per se; instead I wanna get straight to it.

I looked at some of the IL generated by the C# compiler.

object obj = 1;

.locals init ([0] object obj)
L_0000: nop 
L_0001: ldc.i4.1 
L_0002: box int32 // Convert a value type (of the type specified in valTypeToken) to a true object reference. 
L_0007: stloc.0

Found this on MSDN...

A value type has two separate representations within the Common Language Infrastructure (CLI):

A 'raw' form used when a value type is embedded within another object or on the stack.
A 'boxed' form, where the data in the value type is wrapped (boxed) into an object so it can exist as an independent entity.

This has led me to conclude that it should be equally expensive to write code like this...

var box = obj as Box<int>;
if (box != null)
{
    Console.WriteLine(box.Value);
}

If I intend to pass that same value around as a System.Object, do I really wanna unbox and box the ValueType every time? My gut feeling is telling me no, but I can't really find a good motivation. Anyone care to comment on all this blabbering?
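To make the comparison concrete, here is a minimal sketch that puts the two approaches side by side. The Box<T> below mirrors the one in the question; the constructor and the BoxComparison class are additions made only for this sketch. It shows that both paths do one heap allocation and one runtime type check; it does not measure which one is faster.

```csharp
using System;

// Same shape as the Box<T> in the question. The constructor is an
// addition for this sketch, not part of the original type.
public class Box<T> where T : struct
{
    public Box(T value) { Value = value; }

    public T Value { get; set; }

    public static implicit operator T(Box<T> box)
    {
        return box.Value;
    }
}

// Hypothetical helper class, named here only for illustration.
public static class BoxComparison
{
    // Built-in boxing: the assignment allocates a box on the heap,
    // and getting the int back requires a runtime type check.
    public static int ViaBuiltInBoxing(int value)
    {
        object obj = value;
        if (obj is int)
        {
            return (int)obj;
        }
        return -1;
    }

    // Hand-written box: the wrapper is also a heap allocation,
    // and the "as" cast is also a runtime type check.
    public static int ViaBoxOfT(int value)
    {
        object obj = new Box<int>(value);
        var box = obj as Box<int>;
        if (box != null)
        {
            return box.Value;
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(ViaBuiltInBoxing(42)); // prints 42
        Console.WriteLine(ViaBoxOfT(42));        // prints 42
    }
}
```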

EDIT

Anyone ever find themselves doing this? I realize that it might look bizarre, but at one point I found myself in a position where I wanted to abstract computations based on several different representations. I did it like this, together with lambda expressions. It's not really related to boxing, but it sort of allowed me to treat any ValueType (this struct is conveniently 8-byte aligned) as if it were one single type, "ReinterpretCast".

using System.Runtime.InteropServices;

// All fields share offset 0, so they overlay the same 8 bytes.
[StructLayout(LayoutKind.Explicit)]
public struct ReinterpretCast
{
    [FieldOffset(0)] sbyte @sbyte;
    [FieldOffset(0)] byte @byte;
    [FieldOffset(0)] short @short;
    [FieldOffset(0)] ushort @ushort;
    [FieldOffset(0)] int @int;
    [FieldOffset(0)] uint @uint;
    [FieldOffset(0)] long @long;
    [FieldOffset(0)] ulong @ulong;
    [FieldOffset(0)] float @float;
    [FieldOffset(0)] double @double;
}
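For anyone unfamiliar with the overlay trick, here is a cut-down sketch of the same idea with just two fields. FloatIntUnion and FloatBits are hypothetical names for this example, and the fields are made public so they can be used from outside the struct.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical two-field version of the overlay struct above.
[StructLayout(LayoutKind.Explicit)]
public struct FloatIntUnion
{
    [FieldOffset(0)] public float AsFloat;
    [FieldOffset(0)] public int AsInt;
}

public static class ReinterpretDemo
{
    // Writes the float and reads the same 4 bytes back as an int.
    // No boxing and no numeric conversion takes place.
    public static int FloatBits(float f)
    {
        var u = new FloatIntUnion { AsFloat = f };
        return u.AsInt;
    }

    public static void Main()
    {
        // 1.0f has the IEEE 754 bit pattern 0x3F800000.
        Console.WriteLine(FloatBits(1.0f).ToString("X8")); // prints 3F800000
    }
}
```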

Solution

I'm not entirely sure of your question here. Are you just asking whether your solution is perhaps better than normal boxing? It certainly has some appeal. If you're asking why boxing wasn't implemented this way in the first place, just remember that .NET didn't have generics to start with.

EDIT: Boxing is relatively rare with generics anyway. Don't forget that you'll still have to do a runtime cast if a reference to an instance of your type is passed around as object (which is usually the case anyway for boxing). Also don't forget interfaces - if a value type implements an interface, so does its corresponding reference type used for boxing. Your solution won't remove that use of boxing, as you can't make your type "pretend" to implement the interface. (You might be able to do something with the DLR, but by that time most of the point has been lost :)
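The interface point can be checked with a small sketch. Box<T> is the type from the question; InterfaceCheck is a hypothetical name used only here.

```csharp
using System;

// The Box<T> from the question (implicit operator omitted for brevity).
public class Box<T> where T : struct
{
    public T Value { get; set; }
}

public static class InterfaceCheck
{
    // A boxed int still implements every interface that Int32 implements.
    public static bool BoxedIntIsComparable()
    {
        object boxed = 5;
        return boxed is IComparable;
    }

    // Box<int> declares no interfaces, so the same check fails.
    public static bool BoxOfIntIsComparable()
    {
        object wrapped = new Box<int> { Value = 5 };
        return wrapped is IComparable;
    }

    public static void Main()
    {
        Console.WriteLine(BoxedIntIsComparable()); // prints True
        Console.WriteLine(BoxOfIntIsComparable()); // prints False
    }
}
```

Any API that expects an IComparable (or any other interface the value type implements) would therefore still need the built-in box.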

OTHER TIPS

What we think is faster is completely irrelevant. Only the profiler is relevant when considering what is faster.

"If I intend to pass that same value around as an Object do I really wanna unbox/box every time?"

The short answer: No, you wouldn't want to do a lot of boxing/unboxing. It creates overhead in the form of extra garbage, and it tends to be slow (although I think the speed has been optimized in later framework versions).

EDIT: However, if you "pass that same value around as an Object" without casting it back to the value type until it's needed, then it stays boxed the whole way without being unboxed.
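A quick sketch of that behavior, using reference identity to tell boxes apart. StaysBoxed and PassThrough are hypothetical names for this example.

```csharp
using System;

public static class StaysBoxed
{
    // Stands in for any method that takes and returns object.
    private static object PassThrough(object o)
    {
        return o;
    }

    // Passing an already-boxed value around does not re-box it:
    // the same heap object comes out the other end.
    public static bool SameBoxAfterPassing()
    {
        object boxed = 123;
        object result = PassThrough(PassThrough(boxed));
        return object.ReferenceEquals(boxed, result);
    }

    // Each conversion from int to object allocates a separate box.
    public static bool SameBoxPerConversion()
    {
        int value = 123;
        object first = value;
        object second = value;
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(SameBoxAfterPassing());  // prints True
        Console.WriteLine(SameBoxPerConversion()); // prints False
    }
}
```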

But, as everyone said, you don't need to "pass that same value around as an Object" anyway. That's what generics are for, unless you are working on Framework 1.x. Boxing was more relevant back then when the BCL collection classes used System.Object and any value type that went in was boxed.

(As an aside, boxed value types are NOT unboxed if accessed through an interface.)
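A small sketch of that aside. ICounter, Counter and InterfaceAccess are hypothetical types made up for this example.

```csharp
using System;

public interface ICounter
{
    int Count { get; }
    void Increment();
}

public struct Counter : ICounter
{
    private int count;

    public int Count { get { return count; } }

    public void Increment() { count++; }
}

public static class InterfaceAccess
{
    // Calls through the interface operate on the boxed instance itself.
    public static int IncrementThroughInterface()
    {
        ICounter boxed = new Counter(); // boxed once, here
        boxed.Increment();
        boxed.Increment();
        return boxed.Count;
    }

    // Unboxing produces a copy, so the box is left untouched.
    public static int IncrementAnUnboxedCopy()
    {
        object boxed = new Counter();
        Counter copy = (Counter)boxed;
        copy.Increment();
        return ((Counter)boxed).Count;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementThroughInterface()); // prints 2
        Console.WriteLine(IncrementAnUnboxedCopy());    // prints 0
    }
}
```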

The title of your question misses what I think is the most interesting aspect: in what way is the system's boxing behavior different from that of a Box<T> type. There are a few differences:

(1) A boxed T will implement the same interfaces as a T, using the same code, but will mostly behave with class semantics rather than value semantics, albeit with a quirky Equals method.

(2) Mutating a boxed T will generally be a nuisance in C# or vb.net, but a boxed T will never really be immutable, since untrusted verifiable code is allowed to mutate it, even if that's awkward in some languages (it's easy in C++/CLI). Even types like Int32 are mutable when boxed. By contrast, one could define an ImmutableBox<T>, with a constructor taking a T, whose fields would be truly immutable.

(3) Even structure types which are easily mutable when boxed (e.g. because they implement a mutating interface like IEnumerator<T>) and thus behave like mutable reference types, cannot implement Equals to mean reference equality (which would be the normal behavior for a mutable reference type) but generally use it to test equality of their transitory state. By contrast, if there were mutable and immutable box types, it would be possible for the immutable type to check equality of state, and the mutable one to check equality of reference.

(4) An implicit cast from T to Box<T> would not preclude the possibility of type T defining an implicit cast to an interface type. By contrast, because all types are implicitly castable to Object, neither vb.net nor C# will allow for the possibility of an implicit user cast between a struct type and an interface.

(5) Without specialized compiler support for boxing, there would be no way for methods which presently accept a param-array of Object[] to automatically convert parameters from types like Int32 to Box<Int32>. On the other hand, adding a means of requesting that certain parameters be auto-boxed might be better than having implicit boxing everywhere. Note that if such a means existed, it could specify that every parameter should be placed in a Box<T>, thus making it possible to distinguish between passing a T and a Box<T> (since the latter would be passed as a Box<Box<T>>).
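Points (2) and (3) could be sketched roughly as follows. ImmutableBox<T> is the type name suggested above; MutableBox<T>, BoxEquality and all member names are assumptions made for this illustration, not an existing API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable box: state is fixed at construction,
// so comparing by state is a sensible meaning for Equals.
public sealed class ImmutableBox<T> : IEquatable<ImmutableBox<T>> where T : struct
{
    private readonly T value;

    public ImmutableBox(T value) { this.value = value; }

    public T Value { get { return value; } }

    public bool Equals(ImmutableBox<T> other)
    {
        return !object.ReferenceEquals(other, null)
            && EqualityComparer<T>.Default.Equals(value, other.value);
    }

    public override bool Equals(object obj) { return Equals(obj as ImmutableBox<T>); }

    public override int GetHashCode() { return value.GetHashCode(); }
}

// Hypothetical mutable box: keeps the reference equality inherited from Object.
public sealed class MutableBox<T> where T : struct
{
    public T Value { get; set; }
}

public static class BoxEquality
{
    public static bool ImmutableBoxesWithSameState()
    {
        return new ImmutableBox<int>(7).Equals(new ImmutableBox<int>(7));
    }

    public static bool MutableBoxesWithSameState()
    {
        return new MutableBox<int> { Value = 7 }.Equals(new MutableBox<int> { Value = 7 });
    }

    // The built-in box: two distinct objects whose Equals compares state.
    public static bool BuiltInBoxesWithSameState()
    {
        object a = 7;
        object b = 7;
        return a.Equals(b) && !object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(ImmutableBoxesWithSameState()); // prints True
        Console.WriteLine(MutableBoxesWithSameState());   // prints False
        Console.WriteLine(BuiltInBoxesWithSameState());   // prints True
    }
}
```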

Okay, there are several topics you're touching here. First of all, let's take a look at value types and why they exist. Value types are what you use when you need value semantics:

With classes, it is possible for two variables to reference the same object, and thus possible for operations on one variable to affect the object referenced by the other variable. With structs, the variables each have their own copy of the data (except in the case of ref and out parameter variables), and it is not possible for operations on one to affect the other. Furthermore, because structs are not reference types, it is not possible for values of a struct type to be null.

All numeric types, for example, are value types precisely because they need to have value semantics. If the variable x has value 17 and you assign x to y, then y will have its own value 17 and incrementing y won't change x to 18. Therefore, unless you have a good reason, use a struct only when defining a type that needs to have value semantics.
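The x and y example above, next to the same steps done with a class, looks like this as a minimal sketch. IntHolder and ValueSemantics are hypothetical names for this example.

```csharp
using System;

// A reference type wrapping an int, for contrast.
public sealed class IntHolder
{
    public int Value;
}

public static class ValueSemantics
{
    // int is a value type: y gets its own copy of 17.
    public static int IncrementCopyOfInt()
    {
        int x = 17;
        int y = x;
        y++;
        return x;
    }

    // IntHolder is a class: x and y refer to the same object.
    public static int IncrementThroughSecondReference()
    {
        var x = new IntHolder { Value = 17 };
        var y = x;
        y.Value++;
        return x.Value;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementCopyOfInt());              // prints 17
        Console.WriteLine(IncrementThroughSecondReference()); // prints 18
    }
}
```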

At the implementation level, value semantics are enforced by using in-line allocation. You can read more about it here.

This leads us to boxing. When does boxing happen? When you cast a value type into a reference type. You used the type object as an example, but with C# generics, that's something that should happen rarely in practice. A more frequent case would be to cast a value type into an interface; for example, casting a double into an IEquatable or IComparable. At any rate, if you cast a value type into a reference type, boxing will and must occur.
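As a rough sketch of the two cases: the first method below casts a double to an interface, which boxes it, while the second uses a constrained generic parameter so the value never has to become a reference. BoxingAtInterfaces and its method names are made up for this example.

```csharp
using System;

public static class BoxingAtInterfaces
{
    // Converting the double to an interface type is a boxing conversion.
    public static int CompareViaInterface(double a, double b)
    {
        IComparable<double> boxed = a;
        return boxed.CompareTo(b);
    }

    // With a generic constraint, T stays a value type throughout,
    // so no conversion to a reference type is needed.
    public static int CompareGeneric<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b);
    }

    public static void Main()
    {
        Console.WriteLine(CompareViaInterface(1.5, 2.5) < 0); // prints True
        Console.WriteLine(CompareGeneric(1.5, 2.5) < 0);      // prints True
    }
}
```

Both calls give the same answer; only the first one allocates a box along the way.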

What really happens when boxing occurs? A copy of the instance to be boxed is made and placed on the heap, as an independent object, so that it can be safely referenced even when the original instance goes out of scope. If it weren't for boxing, it would be easy to get the CLR to try to access invalid memory, and we all know that's a baaaaad thing.
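The "copy" and "independent object" parts can be seen in a short sketch. Point and BoxIsACopy are hypothetical names for this example.

```csharp
using System;

public struct Point
{
    public int X;
}

public static class BoxIsACopy
{
    // The box outlives the local variable it was copied from.
    public static object MakeBox()
    {
        var local = new Point { X = 1 };
        return local; // boxing: local is copied to the heap
    }

    // Changing the original after boxing does not change the box.
    public static int MutateOriginalAfterBoxing()
    {
        var p = new Point { X = 1 };
        object boxed = p;
        p.X = 99;
        return ((Point)boxed).X;
    }

    public static void Main()
    {
        Console.WriteLine(((Point)MakeBox()).X);        // prints 1
        Console.WriteLine(MutateOriginalAfterBoxing()); // prints 1
    }
}
```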

So, is boxing good or bad? On the one hand it's good, because it allows you to safely cast value types into reference types when you need it. On the other hand, it creates "litter" -- short-lived instances of objects that get discarded and add to the work the garbage collector has to do. Is this bad? Only in some cases, such as developing XNA games. If this is your case, you'll want to avoid uncontrolled boxing; if so, I would also invite you to stop by my blog where I have some bits of advice on that topic.

Licensed under: CC-BY-SA with attribution
