Why is TypedReference behind the scenes? It's so fast and safe… almost magical!

https://stackoverflow.com/questions/4764573

16-10-2019
|

Question

Warning: This question is a bit heretical... religious programmers always abiding by good practices, please don't read it. :)

Does anyone know why the use of TypedReference is so discouraged (implicitly, by lack of documentation)?

I've found great uses for it, such as when passing generic parameters through functions that shouldn't be generic (when using an object might be overkill or slow, if you need a value type), for when you need an opaque pointer, or for when you need to access an element of an array quickly, whose specs you find at runtime (using Array.InternalGetReference). Since the CLR doesn't even allow incorrect usage of this type, why is it discouraged? It doesn't seem to be unsafe or anything...

Other uses I've found for TypedReference:

"Specializing" generics in C# (this is type-safe):

static void foo<T>(ref T value)
{
    //This is the ONLY way to treat value as int, without boxing/unboxing objects
    if (value is int)
    { __refvalue(__makeref(value), int) = 1; }
    else { value = default(T); }
}

Writing code that works with generic pointers (this is very unsafe if misused, but fast and safe if used correctly):

//This bypasses the restriction that you can't have a pointer to T,
//letting you write very high-performance generic code.
//It's dangerous if you don't know what you're doing, but very worth if you do.
static T Read<T>(IntPtr address)
{
    var obj = default(T);
    var tr = __makeref(obj);

    //This is equivalent to shooting yourself in the foot
    //but it's the only high-perf solution in some cases
    //it sets the first field of the TypedReference (which is a pointer)
    //to the address you give it, then it dereferences the value.
    //Better be 10000% sure that your type T is unmanaged/blittable...
    unsafe { *(IntPtr*)(&tr) = address; }

    return __refvalue(tr, T);
}

Writing a method version of the sizeof instruction, which can be occasionally useful:

static class ArrayOfTwoElements<T> { static readonly Value = new T[2]; }

static uint SizeOf<T>()
{
    unsafe 
    {
        TypedReference
            elem1 = __makeref(ArrayOfTwoElements<T>.Value[0] ),
            elem2 = __makeref(ArrayOfTwoElements<T>.Value[1] );
        unsafe
        { return (uint)((byte*)*(IntPtr*)(&elem2) - (byte*)*(IntPtr*)(&elem1)); }
    }
}

Writing a method that passes a "state" parameter that wants to avoid boxing:

static void call(Action<int, TypedReference> action, TypedReference state)
{
    //Note: I could've said "object" instead of "TypedReference",
    //but if I had, then the user would've had to box any value types
    try
    {
        action(0, state);
    }
    finally { /*Do any cleanup needed*/ }
}

So why are uses like this "discouraged" (by lack of documentation)? Any particular safety reasons? It seems perfectly safe and verifiable if it's not mixed with pointers (which aren't safe or verifiable anyway)...

Update:

Sample code to show that, indeed, TypedReference can be twice as fast (or more):

using System;
using System.Collections.Generic;
static class Program
{
    static void Set1<T>(T[] a, int i, int v)
    { __refvalue(__makeref(a[i]), int) = v; }

    static void Set2<T>(T[] a, int i, int v)
    { a[i] = (T)(object)v; }

    static void Main(string[] args)
    {
        var root = new List<object>();
        var rand = new Random();
        for (int i = 0; i < 1024; i++)
        { root.Add(new byte[rand.Next(1024 * 64)]); }
        //The above code is to put just a bit of pressure on the GC

        var arr = new int[5];
        int start;
        const int COUNT = 40000000;

        start = Environment.TickCount;
        for (int i = 0; i < COUNT; i++)
        { Set1(arr, 0, i); }
        Console.WriteLine("Using TypedReference:  {0} ticks",
                          Environment.TickCount - start);
        start = Environment.TickCount;
        for (int i = 0; i < COUNT; i++)
        { Set2(arr, 0, i); }
        Console.WriteLine("Using boxing/unboxing: {0} ticks",
                          Environment.TickCount - start);

        //Output Using TypedReference:  156 ticks
        //Output Using boxing/unboxing: 484 ticks
    }
}

(Edit: I edited the benchmark above, since the last version of the post used a debug version of the code [I forgot to change it to release], and put no pressure on the GC. This version is a bit more realistic, and on my system, it's more than three times faster with TypedReference on average.)

Solution

Short answer: portability.

While __arglist, __makeref, and __refvalue are language extensions and are undocumented in the C# Language Specification, the constructs used to implement them under the hood (vararg calling convention, TypedReference type, arglist, refanytype, mkanyref, and refanyval instructions) are perfectly documented in the CLI Specification (ECMA-335) in the Vararg library.

Being defined in the Vararg Library makes it quite clear that they are primarily meant to support variable-length argument lists and not much else. Variable-argument lists have little use in platforms that don't need to interface with external C code that uses varargs. For this reason, the Varargs library is not part of any CLI profile. Legitimate CLI implementations may choose not to support Varargs library as it's not included in the CLI Kernel profile:

4.1.6 Vararg

The vararg feature set supports variable-length argument lists and runtime-typed pointers.

If omitted: Any attempt to reference a method with the vararg calling convention or the signature encodings associated with vararg methods (see Partition II) shall throw the System.NotImplementedException exception. Methods using the CIL instructions arglist, refanytype, mkrefany, and refanyval shall throw the System.NotImplementedException exception. The precise timing of the exception is not specified. The type System.TypedReference need not be defined.

Update (reply to `GetValueDirect` comment):

FieldInfo.GetValueDirect are FieldInfo.SetValueDirect are not part of Base Class Library. Note that there's a difference between .NET Framework Class Library and Base Class Library. BCL is the only thing required for a conforming implementation of the CLI/C# and is documented in ECMA TR/84. (In fact, FieldInfo itself is part of the Reflection library and that's not included in CLI Kernel profile either).

As soon as you use a method outside BCL, you are giving up a bit of portability (and this is becoming increasingly important with the advent of non-.NET CLI implementations like Silverlight and MonoTouch). Even if an implementation wanted to increase compatiblility with the Microsoft .NET Framework Class Library, it could simply provide GetValueDirect and SetValueDirect taking a TypedReference without making the TypedReference specially handled by the runtime (basically, making them equivalent to their object counterparts without the performance benefit).

Had they documented it in C#, it would have had at least a couple implications:

Like any feature, it may become a roadblock to new features, especially since this one doesn't really fit in the design of C# and requires weird syntax extensions and special handing of a type by the runtime.
All implementations of C# have to somehow implement this feature and it's not necessarily trivial/possible for C# implementations that don't run on top of a CLI at all or run on top of a CLI without Varargs.

OTHER TIPS

Well, I'm no Eric Lippert, so I can't speak directly of Microsoft's motivations, but if I were to venture a guess, I'd say that TypedReference et al. aren't well documented because, frankly, you don't need them.

Every use you mentioned for these features can be accomplished without them, albeit at a performance penalty in some cases. But C# (and .NET in general) isn't designed to be a high-performance language. (I'm guessing that "faster than Java" was the performance goal.)

That's not to say that certain performance considerations haven't been afforded. Indeed, such features as pointers, stackalloc, and certain optimized framework functions exist largely to boost performance in certain situations.

Generics, which I'd say have the primary benefit of type safety, also improve performance similarly to TypedReference by avoiding boxing and unboxing. In fact, I was wondering why you'd prefer this:

static void call(Action<int, TypedReference> action, TypedReference state){
    action(0, state);
}

to this:

static void call<T>(Action<int, T> action, T state){
    action(0, state);
}

The trade-offs, as I see them, are that the former requires fewer JITs (and, it follows, less memory), while the latter is more familiar and, I would assume, slightly faster (by avoiding pointer dereferencing).

I'd call TypedReference and friends implementation details. You've pointed out some neat uses for them, and I think they're worth exploring, but the usual caveat of relying on implementation details applies—the next version may break your code.

I can't figure out whether this question's title is supposed to be sarcastic: It has been long-established that TypedReference is the slow, bloated, ugly cousin of 'true' managed pointers, the latter being what we get with C++/CLI interior_ptr<T>, or even traditional by-reference (ref/out) parameters in C#. In fact, it's pretty hard to make TypedReference even reach the baseline performance of just using an integer to re-index off the original CLR array every time.

The sad details are here, but thankfully, none of this matters now...

This question is now rendered moot by the new ref locals and ref return features in C# 7

These new language features provide prominent, first-class support in C# for declaring, sharing, and manipulating true CLR managed reference type-types in carefully prescibed situations.

The use restrictions are no stricter than what was previously required for TypedReference (and the performance is literally jumping from worst to best), so I see no remaining conceivable use case in C# for TypedReference. For example, previously there was no way to persist a TypedReference in the GC heap, so the same being true of the superior managed pointers now is not a take-away.

And obviously, the demise of TypedReference—or its nearly complete deprecation at least—means throw __makeref on the junkheap as well.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow

Why is TypedReference behind the scenes? It's so fast and safe… almost magical!

4.1.6 Vararg

Update (reply to GetValueDirect comment):

Update (reply to `GetValueDirect` comment):